W10. Singular Value Decomposition (SVD)
1. Summary
1.1 Introduction and Motivation
Eigenvalue decomposition is one of the most powerful tools in linear algebra, but it has a significant limitation: it applies only to square matrices and, even then, only to diagonalizable ones. Many of the most important matrices in applications — tall data matrices, wide system matrices, rectangular transformations — fall outside this framework entirely. We need a universal factorization that works for any matrix, of any shape and any rank.
There is a second, subtler limitation. Even for symmetric matrices, eigenvalue decomposition gives us a single basis of eigenvectors that simultaneously diagonalizes the matrix in both the domain and the codomain, because for a square matrix \(A = PDP^{-1}\) the input and output spaces coincide. But for a general rectangular matrix \(A \in \mathbb{R}^{m \times n}\), the domain is \(\mathbb{R}^n\) and the codomain is \(\mathbb{R}^m\) — two completely different spaces. We need two orthonormal bases: one for the domain and one for the codomain.
The Spectral Theorem tells us that every real symmetric matrix \(S = S^T\) has an eigendecomposition \(S = Q\Lambda Q^T\) with \(Q\) orthogonal. This is a beautiful result, but it is special: symmetry is a strong assumption. What we want is something equally clean for arbitrary matrices.
Singular Value Decomposition (SVD) is exactly that universal method. It factors any real matrix \(A \in \mathbb{R}^{m \times n}\) as a product of three matrices — an orthogonal matrix, a diagonal matrix, and another orthogonal matrix — capturing the complete geometric and algebraic structure of \(A\) in a way that eigenvalue decomposition cannot. Every real matrix has an SVD. No condition on shape, rank, or symmetry is required.
1.2 SVD Basics
The fundamental result is:
Theorem (SVD Existence): For every matrix \(A \in \mathbb{R}^{m \times n}\) with rank \(r\), there exist orthogonal matrices \(U \in \mathbb{R}^{m \times m}\), \(V \in \mathbb{R}^{n \times n}\), and a matrix \(\Sigma \in \mathbb{R}^{m \times n}\) such that:
\[A = U \Sigma V^T\]
The columns of \(U\) are called the left singular vectors of \(A\). They form an orthonormal basis for \(\mathbb{R}^m\). The columns of \(V\) are called the right singular vectors of \(A\). They form an orthonormal basis for \(\mathbb{R}^n\). The matrix \(\Sigma\) is diagonal — meaning its only non-zero entries are on the main diagonal — with non-negative real values \(\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0\). These diagonal entries are called the singular values of \(A\).
To read this decomposition geometrically: the matrix \(V^T\) rotates (or reflects) vectors in \(\mathbb{R}^n\) so that they align with the coordinate axes; \(\Sigma\) then stretches each coordinate axis by the corresponding singular value (and maps from \(\mathbb{R}^n\) to \(\mathbb{R}^m\)); finally \(U\) rotates (or reflects) in \(\mathbb{R}^m\) to produce the output. Every linear map can be decomposed into this sequence: rotate–stretch–rotate.
1.2.1 Shape of \(\Sigma\) for Different Matrix Sizes
The matrix \(\Sigma\) has the same shape as \(A\) — that is, \(\Sigma \in \mathbb{R}^{m \times n}\) — but all entries outside the main diagonal are zero. Its diagonal entries are the singular values.
- Tall matrix (\(m > n\)): \(\Sigma\) looks like a diagonal \(n \times n\) block stacked on top of an \((m - n) \times n\) zero block. There are at most \(n\) singular values.
- Wide matrix (\(m < n\)): \(\Sigma\) looks like a diagonal \(m \times m\) block followed by an \(m \times (n - m)\) zero block to the right. There are at most \(m\) singular values.
- Square matrix (\(m = n\)): \(\Sigma\) is an \(n \times n\) diagonal matrix.
In all cases, at most \(\min(m, n)\) singular values can be non-zero, and the rank \(r\) of \(A\) equals the number of strictly positive singular values.
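These block shapes are easy to check numerically. A minimal sketch using NumPy (assumed available); the \(4 \times 2\) matrix below is a made-up tall example:

```python
import numpy as np

# A hypothetical 4x2 "tall" matrix: Sigma will be a 2x2 diagonal block
# stacked on a 2x2 zero block.
A = np.array([[3.0, 0.0], [4.0, 0.0], [0.0, 1.0], [0.0, 2.0]])
m, n = A.shape

U, s, Vt = np.linalg.svd(A, full_matrices=True)  # full (not thin) SVD
assert U.shape == (m, m) and Vt.shape == (n, n) and len(s) == min(m, n)

# Rebuild the m x n Sigma with the singular values on the main diagonal.
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)

# U Sigma V^T reproduces A up to floating-point error.
assert np.allclose(U @ Sigma @ Vt, A)
```

The columns of this particular `A` are orthogonal, so the singular values are just the column norms, \(5\) and \(\sqrt{5}\).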
1.3 Properties of Singular Values
Singular values have several important properties that make them a robust measure of the “size” and “complexity” of a matrix.
Non-negativity and ordering: By convention, singular values are always listed in non-increasing order: \(\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0 = \sigma_{r+1} = \cdots\). They are always non-negative real numbers regardless of the entries of \(A\).
Rank: The rank of \(A\) equals the number of non-zero singular values. This gives an efficient way to determine rank by computing the SVD.
Connection to matrix norms: The largest singular value \(\sigma_1\) equals the operator (spectral) norm \(\|A\|_2 = \max_{\|x\|=1} \|Ax\|\). The Frobenius norm satisfies \(\|A\|_F = \sqrt{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_r^2}\).
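Both norm identities can be verified on a random matrix; this NumPy sketch assumes NumPy is available and uses an arbitrary \(5 \times 3\) Gaussian matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
s = np.linalg.svd(A, compute_uv=False)  # singular values only, descending

# Spectral (operator) norm equals the largest singular value.
assert np.isclose(np.linalg.norm(A, 2), s[0])
# Frobenius norm equals sqrt of the sum of squared singular values.
assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2)))
```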
Uniqueness: The singular values of \(A\) are unique (they depend only on \(A\), not on any particular factorization). The singular vectors, however, are not unique when singular values are repeated.
1.4 Computing the SVD
The practical algorithm for computing the SVD relies on a deep connection between \(A\) and the symmetric matrices \(A^TA\) and \(AA^T\).
1.4.1 Connection to \(A^TA\) and \(AA^T\)
Let \(A = U\Sigma V^T\). Then:
\[A^TA = (U\Sigma V^T)^T(U\Sigma V^T) = V\Sigma^T U^T U \Sigma V^T = V(\Sigma^T\Sigma)V^T\]
This is exactly the eigendecomposition of \(A^TA\): the columns of \(V\) are the eigenvectors of \(A^TA\), and the eigenvalues of \(A^TA\) are \(\sigma_i^2\) (the squares of the singular values).
Similarly:
\[AA^T = U\Sigma V^T (U\Sigma V^T)^T = U\Sigma V^T V \Sigma^T U^T = U(\Sigma\Sigma^T)U^T\]
So the columns of \(U\) are the eigenvectors of \(AA^T\), and the eigenvalues of \(AA^T\) are also \(\sigma_i^2\).
This gives us the key equations:
\[A^T A\, \mathbf{v}_i = \sigma_i^2\, \mathbf{v}_i, \qquad \mathbf{u}_i = \frac{A\mathbf{v}_i}{\sigma_i}\]
The right singular vectors \(\mathbf{v}_i\) are the unit eigenvectors of \(A^TA\); the left singular vectors \(\mathbf{u}_i\) for \(i \leq r\) are obtained by applying \(A\) to \(\mathbf{v}_i\) and normalizing. The remaining columns of \(U\) (when \(r < m\)) are chosen as any orthonormal basis for the null space of \(A^T\).
1.4.2 Step-by-Step Algorithm
Given \(A \in \mathbb{R}^{m \times n}\):
- Compute \(A^TA \in \mathbb{R}^{n \times n}\). This is symmetric and positive semi-definite.
- Find the eigenvalues \(\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq 0\) of \(A^TA\). These satisfy \(\lambda_i \geq 0\) because \(A^TA\) is PSD.
- Extract singular values: \(\sigma_i = \sqrt{\lambda_i}\) for \(i = 1, \ldots, r\), where \(r\) is the number of positive eigenvalues.
- Find unit eigenvectors of \(A^TA\) to form the columns \(\mathbf{v}_1, \ldots, \mathbf{v}_n\) of \(V\).
- Compute left singular vectors: \(\mathbf{u}_i = \dfrac{A\mathbf{v}_i}{\sigma_i}\) for \(i = 1, \ldots, r\).
- Extend \(U\): complete \(\{\mathbf{u}_1, \ldots, \mathbf{u}_r\}\) to a full orthonormal basis of \(\mathbb{R}^m\) (use the null space of \(A^T\) for the remaining vectors).
- Assemble \(U\), \(\Sigma\), \(V\) and verify \(A = U\Sigma V^T\).
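The steps above can be sketched in NumPy (assumed available). This is a teaching implementation of the textbook route via \(A^TA\); in practice `np.linalg.svd` is the numerically stable choice, and the helper name `svd_via_ata` and the test matrix are our own:

```python
import numpy as np

def svd_via_ata(A, tol=1e-12):
    """Textbook SVD: eigendecompose A^T A, then u_i = A v_i / sigma_i."""
    m, n = A.shape
    lam, V = np.linalg.eigh(A.T @ A)              # eigenvalues ascending
    order = np.argsort(lam)[::-1]                 # reorder to descending
    lam, V = lam[order], V[:, order]
    sigma = np.sqrt(np.clip(lam, 0.0, None))      # sigma_i = sqrt(lambda_i)
    r = int(np.sum(sigma > tol))                  # rank = # positive sigmas
    U = np.zeros((m, m))
    U[:, :r] = (A @ V[:, :r]) / sigma[:r]         # u_i = A v_i / sigma_i
    if r < m:
        # Extend u_1..u_r to an orthonormal basis of R^m using the
        # orthogonal complement (the null space of A^T).
        P = np.eye(m) - U[:, :r] @ U[:, :r].T
        Q, _, _ = np.linalg.svd(P)
        U[:, r:] = Q[:, :m - r]
    Sigma = np.zeros((m, n))
    k = min(m, n)
    Sigma[:k, :k] = np.diag(sigma[:k])
    return U, Sigma, V

A = np.array([[1.0, 2.0], [3.0, 6.0], [0.0, 1.0]])  # small 3x2 test matrix
U, Sigma, V = svd_via_ata(A)
assert np.allclose(U @ Sigma @ V.T, A)              # verify A = U Sigma V^T
```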
1.4.3 Special Cases
Rank-1 matrix: If \(A\) has rank 1, then \(A = \sigma_1\,\mathbf{u}_1\mathbf{v}_1^T\) for unit vectors \(\mathbf{u}_1, \mathbf{v}_1\): a scaled outer product. Equivalently, if \(A = \mathbf{x}\mathbf{y}^T\) for non-zero vectors \(\mathbf{x}, \mathbf{y}\), the only non-zero singular value is \(\sigma_1 = \|\mathbf{x}\|\,\|\mathbf{y}\|\); all other singular values are zero.
Symmetric positive semi-definite matrix: If \(A = A^T\) and all eigenvalues \(\lambda_i \geq 0\), then the SVD coincides with the eigendecomposition: \(\sigma_i = \lambda_i\) and \(U = V = Q\) (the matrix of eigenvectors). Computing the SVD is then simply the same as diagonalizing the matrix.
1.5 Algebraic Properties of the SVD
Given \(A = U\Sigma V^T\), we can immediately read off the SVD of related matrices:
Transpose: \(A^T = V\Sigma^T U^T\). The roles of \(U\) and \(V\) are simply swapped. The singular values remain the same — \(A\) and \(A^T\) always have the same singular values.
Inverse (square, full-rank case): If \(A\) is \(n \times n\) and invertible, then: \[A^{-1} = V\Sigma^{-1}U^T, \quad \text{where } \Sigma^{-1} = \text{diag}(1/\sigma_1, \ldots, 1/\sigma_n)\]
This is immediate because \((V\Sigma^{-1}U^T)(U\Sigma V^T) = V\Sigma^{-1}\Sigma V^T = VV^T = I\).
Scalar multiple: For any scalar \(c\): \[(cA) = U(c\Sigma)V^T\]
The singular values of \(cA\) are \(|c|\sigma_1, |c|\sigma_2, \ldots\). The singular vectors \(U\) and \(V\) are unchanged (they do not depend on scaling). If \(c < 0\), the singular values are still non-negative, since the sign can be absorbed into \(U\) or \(V\).
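All three properties can be confirmed numerically; this NumPy sketch (NumPy assumed available) uses the matrix from Example 4.2 and the arbitrary scalar \(c = -3\):

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 0.0]])
s = np.linalg.svd(A, compute_uv=False)

# A and A^T share the same singular values.
assert np.allclose(np.linalg.svd(A.T, compute_uv=False), s)
# Singular values of A^{-1} are the reciprocals, re-sorted into
# decreasing order (1/sigma_2 >= 1/sigma_1).
assert np.allclose(np.linalg.svd(np.linalg.inv(A), compute_uv=False),
                   (1 / s)[::-1])
# Scaling by c = -3 multiplies every singular value by |c| = 3.
assert np.allclose(np.linalg.svd(-3 * A, compute_uv=False), 3 * s)
```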
1.6 The Four Fundamental Subspaces via SVD
The SVD provides a complete and elegant description of the four fundamental subspaces of a matrix. This is one of the most important applications of the decomposition.
Recall that for \(A \in \mathbb{R}^{m \times n}\) with rank \(r\):
- The column space \(C(A) \subseteq \mathbb{R}^m\) has dimension \(r\).
- The left null space \(N(A^T) \subseteq \mathbb{R}^m\) has dimension \(m - r\).
- The row space \(C(A^T) \subseteq \mathbb{R}^n\) has dimension \(r\).
- The null space \(N(A) \subseteq \mathbb{R}^n\) has dimension \(n - r\).
From the SVD \(A = U\Sigma V^T\), we read off these subspaces directly:
- \(C(A)\): spanned by the first \(r\) columns of \(U\): \(\{\mathbf{u}_1, \ldots, \mathbf{u}_r\}\).
- \(N(A^T)\): spanned by the last \(m - r\) columns of \(U\): \(\{\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m\}\).
- \(C(A^T)\): spanned by the first \(r\) columns of \(V\): \(\{\mathbf{v}_1, \ldots, \mathbf{v}_r\}\).
- \(N(A)\): spanned by the last \(n - r\) columns of \(V\): \(\{\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}\).
This is the Fundamental Theorem of Linear Algebra: the four subspaces are organized into two orthogonal pairs: \(C(A) \perp N(A^T)\) in \(\mathbb{R}^m\), and \(C(A^T) \perp N(A)\) in \(\mathbb{R}^n\). The columns of \(U\) form an orthonormal basis for \(\mathbb{R}^m\) that simultaneously respects both \(C(A)\) and \(N(A^T)\); the columns of \(V\) form an orthonormal basis for \(\mathbb{R}^n\) that simultaneously respects both \(C(A^T)\) and \(N(A)\).
Why this works: \(A\mathbf{v}_i = \sigma_i \mathbf{u}_i\) for \(i \leq r\), so the first \(r\) columns of \(V\) map directly onto \(r\) independent vectors in \(C(A)\). For \(i > r\), \(A\mathbf{v}_i = \mathbf{0}\), confirming that \(\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\) lie in \(N(A)\).
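These subspace facts can be read off a computed SVD directly. A NumPy sketch (NumPy assumed available), using the rank-1 matrix from Example 4.4:

```python
import numpy as np

# Rank-1 matrix: column 2 equals 2 * column 1.
A = np.array([[1.0, 2.0], [3.0, 6.0]])
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))        # numerical rank = # positive sigmas
assert r == 1

V = Vt.T
# v_{r+1},...,v_n span the null space: A v = 0.
assert np.allclose(A @ V[:, r:], 0.0)
# u_{r+1},...,u_m span the left null space: A^T u = 0.
assert np.allclose(A.T @ U[:, r:], 0.0)
# A v_1 lands in the column space along u_1: A v_1 = sigma_1 u_1.
assert np.allclose(A @ V[:, 0], s[0] * U[:, 0])
```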
1.7 SVD as a Sum of Rank-1 Matrices
Expanding the product \(A = U\Sigma V^T\) column by column yields a beautiful representation:
\[A = \sum_{i=1}^{r} \sigma_i\, \mathbf{u}_i \mathbf{v}_i^T\]
Each term \(\sigma_i \mathbf{u}_i \mathbf{v}_i^T\) is a rank-1 matrix — the outer product of two vectors, scaled by the \(i\)-th singular value. This decomposition organizes the “information” in \(A\) by importance: the first term \(\sigma_1 \mathbf{u}_1 \mathbf{v}_1^T\) is the most important (it contributes the most to the matrix, since \(\sigma_1\) is largest), and successive terms are progressively less significant.
This information hierarchy is the key insight behind low-rank approximation: if we drop the small terms, we get a matrix that is “almost” \(A\) but has much lower rank — and hence can be stored and processed far more efficiently.
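The rank-1 expansion is straightforward to verify term by term in NumPy (assumed available), here on an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
U, s, Vt = np.linalg.svd(A, full_matrices=False)  # thin SVD: r <= 3 triples

# Rebuild A as the sum of its rank-1 pieces sigma_i * u_i v_i^T.
terms = [s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s))]
assert np.allclose(sum(terms), A)
```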
1.8 Applications: Low-Rank Approximation, Image Compression, and Denoising
1.8.1 Frobenius Norm and Energy
The Frobenius norm of a matrix \(A\) is:
\[\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2} = \sqrt{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_r^2}\]
The second equality follows directly from the SVD. The squared Frobenius norm can be thought of as the total energy of the matrix. The fraction of energy captured by the top \(k\) singular values is:
\[\text{Energy fraction} = \frac{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_k^2}{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_r^2}\]
This ratio is fundamental in deciding how many terms to keep in a low-rank approximation.
1.8.2 The Eckart–Young Theorem
The rank-\(k\) approximation of \(A\) is:
\[A_k = \sum_{i=1}^{k} \sigma_i\, \mathbf{u}_i \mathbf{v}_i^T\]
This is obtained by keeping only the top \(k\) terms in the rank-1 expansion and discarding the rest.
Theorem (Eckart–Young, 1936): Among all matrices \(B\) of rank at most \(k\), the best approximation to \(A\) in the Frobenius norm (and also in the spectral norm) is \(A_k\):
\[\|A - A_k\|_F = \min_{\text{rank}(B) \leq k} \|A - B\|_F = \sqrt{\sigma_{k+1}^2 + \sigma_{k+2}^2 + \cdots + \sigma_r^2}\]
This theorem guarantees that the SVD rank-\(k\) truncation is optimal — no other rank-\(k\) matrix gets closer to \(A\). The approximation error is precisely the energy of the discarded singular values.
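The error formula can be checked numerically; this NumPy sketch (NumPy assumed available) truncates a random matrix at an arbitrary \(k = 2\):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # keep only the top-k terms

# Eckart-Young: the Frobenius error equals the energy of the
# discarded singular values.
err = np.linalg.norm(A - A_k, 'fro')
assert np.isclose(err, np.sqrt(np.sum(s[k:]**2)))
```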
1.8.3 Image Compression
A grayscale image of size \(m \times n\) pixels is naturally a matrix \(A \in \mathbb{R}^{m \times n}\). Storing the full matrix requires \(mn\) numbers. Using the rank-\(k\) SVD approximation, we store:
- \(k\) singular values: \(k\) numbers
- \(k\) left singular vectors: \(mk\) numbers
- \(k\) right singular vectors: \(nk\) numbers
Total storage cost: \(k(m + n + 1)\) numbers, compared to \(mn\) for the full matrix. The compression ratio is \(k(m + n + 1) / (mn)\).
For small \(k\), this can achieve dramatic compression while retaining most of the image structure, because natural images tend to have rapidly decaying singular values — most of the energy is concentrated in the first few singular values.
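As a quick sanity check of the storage formula — the \(512 \times 512\) image size and \(k = 20\) here are hypothetical:

```python
# Storage sketch for a hypothetical 512 x 512 grayscale image at rank k = 20.
m, n, k = 512, 512, 20
full_cost = m * n                 # store every pixel
svd_cost = k * (m + n + 1)        # k u-vectors, k v-vectors, k sigmas
assert full_cost == 262144 and svd_cost == 20500
print(f"compressed size is {svd_cost / full_cost:.1%} of the original")
# prints: compressed size is 7.8% of the original
```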
1.8.4 Denoising and PCA
Beyond compression, the SVD is central to Principal Component Analysis (PCA): the right singular vectors \(\mathbf{v}_i\) (or equivalently, the eigenvectors of \(A^TA\)) are the principal components — the directions of maximum variance in the data. Projecting data onto the top-\(k\) principal components extracts the most informative features.
In denoising, we model a noisy matrix as \(A = A_{\text{signal}} + A_{\text{noise}}\), where the signal has low rank and the noise is spread across all singular values. Truncating the SVD to the top few singular values removes the noise while preserving the signal.
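A toy denoising sketch in NumPy (assumed available): the data is synthetic, and the true rank (2) and noise level (0.01) are made-up parameters chosen so the low-rank signal dominates.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical data: a rank-2 signal plus small dense Gaussian noise.
signal = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 30))
noisy = signal + 0.01 * rng.standard_normal((50, 30))

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
denoised = U[:, :2] @ np.diag(s[:2]) @ Vt[:2, :]   # keep the top-2 terms

# Truncation recovers the signal better than the raw noisy matrix does.
assert np.linalg.norm(denoised - signal) < np.linalg.norm(noisy - signal)
```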
2. Definitions
- Singular Value Decomposition (SVD): A factorization \(A = U\Sigma V^T\) of any real matrix \(A \in \mathbb{R}^{m \times n}\), where \(U \in \mathbb{R}^{m \times m}\) and \(V \in \mathbb{R}^{n \times n}\) are orthogonal matrices and \(\Sigma \in \mathbb{R}^{m \times n}\) has non-negative entries only on the main diagonal.
- Left singular vectors: The columns \(\mathbf{u}_1, \ldots, \mathbf{u}_m\) of \(U\) in the SVD; they form an orthonormal basis of \(\mathbb{R}^m\).
- Right singular vectors: The columns \(\mathbf{v}_1, \ldots, \mathbf{v}_n\) of \(V\) in the SVD; they form an orthonormal basis of \(\mathbb{R}^n\).
- Singular values: The non-negative diagonal entries \(\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0\) of \(\Sigma\); they satisfy \(\sigma_i = \sqrt{\lambda_i(A^TA)}\).
- Rank (from SVD): The rank of \(A\) equals the number of strictly positive singular values of \(A\).
- Column space \(C(A)\): The subspace of \(\mathbb{R}^m\) spanned by the columns of \(A\); in the SVD, spanned by \(\{\mathbf{u}_1, \ldots, \mathbf{u}_r\}\).
- Null space \(N(A)\): The subspace of \(\mathbb{R}^n\) consisting of all \(\mathbf{x}\) with \(A\mathbf{x} = \mathbf{0}\); in the SVD, spanned by \(\{\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}\).
- Row space \(C(A^T)\): The column space of \(A^T\); equivalently, the subspace of \(\mathbb{R}^n\) spanned by the rows of \(A\); in the SVD, spanned by \(\{\mathbf{v}_1, \ldots, \mathbf{v}_r\}\).
- Left null space \(N(A^T)\): The null space of \(A^T\); a subspace of \(\mathbb{R}^m\); in the SVD, spanned by \(\{\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m\}\).
- Frobenius norm: \(\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2} = \sqrt{\sum_i \sigma_i^2}\); a measure of the total “size” of a matrix.
- Rank-\(k\) approximation: The matrix \(A_k = \sum_{i=1}^k \sigma_i \mathbf{u}_i \mathbf{v}_i^T\), obtained by keeping only the \(k\) largest singular value terms in the SVD expansion.
- Eckart–Young theorem: The rank-\(k\) approximation \(A_k\) is the closest matrix of rank at most \(k\) to \(A\) in both the Frobenius norm and the spectral norm; the approximation error is \(\|A - A_k\|_F = \sqrt{\sum_{i > k} \sigma_i^2}\).
- Outer product (rank-1 matrix): The matrix \(\mathbf{u}\mathbf{v}^T \in \mathbb{R}^{m \times n}\) formed from column vectors \(\mathbf{u} \in \mathbb{R}^m\) and \(\mathbf{v} \in \mathbb{R}^n\); always has rank at most 1.
3. Formulas
- SVD factorization: \(A = U\Sigma V^T\), where \(U^TU = I_m\), \(V^TV = I_n\), \(\Sigma_{ij} = 0\) for \(i \neq j\), \(\Sigma_{ii} = \sigma_i \geq 0\).
- Right singular vectors from \(A^TA\): \(A^T A\, \mathbf{v}_i = \sigma_i^2\, \mathbf{v}_i\) (eigenvectors of \(A^TA\)).
- Left singular vectors from right: \(\mathbf{u}_i = \dfrac{A\mathbf{v}_i}{\sigma_i}\) for \(i = 1, \ldots, r\).
- SVD of transpose: \(A^T = V\Sigma^T U^T\).
- SVD of inverse (square, invertible): \(A^{-1} = V\Sigma^{-1}U^T\).
- SVD of scalar multiple: \(cA = U(c\Sigma)V^T\); singular values of \(cA\) are \(|c|\sigma_i\).
- Rank-1 expansion: \(A = \displaystyle\sum_{i=1}^{r} \sigma_i\, \mathbf{u}_i \mathbf{v}_i^T\).
- Best rank-\(k\) approximation (Eckart–Young): \(A_k = \displaystyle\sum_{i=1}^{k} \sigma_i\, \mathbf{u}_i \mathbf{v}_i^T\).
- Frobenius approximation error: \(\|A - A_k\|_F^2 = \sigma_{k+1}^2 + \sigma_{k+2}^2 + \cdots + \sigma_r^2\).
- Frobenius norm via singular values: \(\|A\|_F = \sqrt{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_r^2}\).
- Storage cost of rank-\(k\) SVD: \(k(m + n + 1)\) numbers.
- Energy fraction of top \(k\) singular values: \(\dfrac{\displaystyle\sum_{i=1}^{k} \sigma_i^2}{\displaystyle\sum_{i=1}^{r} \sigma_i^2}\).
4. Examples
4.1 Compute \(A^TA\), Singular Values, and Rank (Lab 8, Task 1)
Let \(A = \begin{bmatrix} 3 & 0 \\ 4 & 0 \end{bmatrix}\). Compute \(A^TA\), find all singular values, determine the rank, and give a geometric interpretation of the action of \(A\).
Key Concept: We compute \(A^TA\), find its eigenvalues, take square roots to get singular values, and then interpret the result geometrically. A zero singular value signals rank deficiency and a collapsed image.
Step 1 — Compute \(A^TA\).
\[A^TA = \begin{bmatrix} 3 & 4 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 3 & 0 \\ 4 & 0 \end{bmatrix} = \begin{bmatrix} 3\cdot3+4\cdot4 & 3\cdot0+4\cdot0 \\ 0\cdot3+0\cdot4 & 0\cdot0+0\cdot0 \end{bmatrix} = \begin{bmatrix} 25 & 0 \\ 0 & 0 \end{bmatrix}\]
Step 2 — Find eigenvalues of \(A^TA\).
Since \(A^TA = \text{diag}(25, 0)\) is diagonal, the eigenvalues are read directly: \[\lambda_1 = 25, \qquad \lambda_2 = 0\]
Step 3 — Extract singular values.
\[\sigma_1 = \sqrt{25} = 5, \qquad \sigma_2 = \sqrt{0} = 0\]
The rank equals the number of strictly positive singular values: \(\text{rank}(A) = 1\).
Step 4 — Geometric interpretation.
Because rank \(= 1\), the matrix \(A\) collapses the entire plane \(\mathbb{R}^2\) onto a single line in \(\mathbb{R}^2\). Concretely, \(A\begin{bmatrix}x\\y\end{bmatrix} = x\begin{bmatrix}3\\4\end{bmatrix}\), so the image is the line spanned by \((3,4)^T\), which in unit-vector form is \(\mathbf{u}_1 = \tfrac{1}{5}(3,4)^T\). The maximum stretch of a unit input vector is \(\sigma_1 = 5\). The unit circle is mapped not to an ellipse but to a line segment of length \(2\sigma_1 = 10\), because the second singular value is zero.
Answer: \(A^TA = \begin{bmatrix}25&0\\0&0\end{bmatrix}\), singular values \(\sigma_1 = 5\), \(\sigma_2 = 0\), \(\text{rank}(A) = 1\). Geometrically, \(A\) collapses the plane onto the line spanned by \(\tfrac{1}{5}(3,4)^T\), stretching by a factor of \(5\).
4.2 Compute the Full SVD (Lab 8, Task 2)
Compute the full SVD of \(A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}\).
Key Concept: We follow the standard SVD algorithm: compute \(A^TA\), diagonalize it to get \(V\) and \(\Sigma\), then compute left singular vectors via \(\mathbf{u}_i = A\mathbf{v}_i / \sigma_i\).
Step 1 — Compute \(A^TA\).
\[A^TA = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}^T\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}\]
Step 2 — Find eigenvalues of \(A^TA\).
\[\det(A^TA - \lambda I) = (2-\lambda)(1-\lambda) - 1 = \lambda^2 - 3\lambda + 1 = 0\] \[\lambda = \frac{3 \pm \sqrt{9-4}}{2} = \frac{3 \pm \sqrt{5}}{2}\]
So \(\lambda_1 = \dfrac{3+\sqrt{5}}{2} \approx 2.618\) and \(\lambda_2 = \dfrac{3-\sqrt{5}}{2} \approx 0.382\).
Step 3 — Extract singular values.
\[\sigma_1 = \sqrt{\frac{3+\sqrt{5}}{2}} \approx 1.618, \qquad \sigma_2 = \sqrt{\frac{3-\sqrt{5}}{2}} \approx 0.618\]
Step 4 — Find right singular vectors (columns of \(V\)).
For \(\lambda_1 = \dfrac{3+\sqrt{5}}{2}\): solve \((A^TA - \lambda_1 I)\mathbf{v} = 0\).
\[\left(2 - \frac{3+\sqrt{5}}{2}\right)v_1 + v_2 = 0 \implies \frac{1-\sqrt{5}}{2}v_1 + v_2 = 0 \implies v_2 = \frac{\sqrt{5}-1}{2}v_1\]
Take \(v_1\) proportional to \((2, \sqrt{5}-1)^T\). Its norm is: \[\|(2, \sqrt{5}-1)^T\| = \sqrt{4 + (\sqrt{5}-1)^2} = \sqrt{4 + 6 - 2\sqrt{5}} = \sqrt{10-2\sqrt{5}}\]
\[\mathbf{v}_1 = \frac{1}{\sqrt{10-2\sqrt{5}}}\begin{bmatrix}2\\\sqrt{5}-1\end{bmatrix}\]
For \(\lambda_2\): \(\mathbf{v}_2\) is orthogonal to \(\mathbf{v}_1\):
\[\mathbf{v}_2 = \frac{1}{\sqrt{10-2\sqrt{5}}}\begin{bmatrix}1-\sqrt{5}\\2\end{bmatrix}\]
Step 5 — Compute left singular vectors.
\[A\mathbf{v}_1 = \frac{1}{\sqrt{10-2\sqrt{5}}}\begin{bmatrix}1&1\\1&0\end{bmatrix}\begin{bmatrix}2\\\sqrt{5}-1\end{bmatrix} = \frac{1}{\sqrt{10-2\sqrt{5}}}\begin{bmatrix}1+\sqrt{5}\\2\end{bmatrix}\]
\[\|(1+\sqrt{5}, 2)^T\| = \sqrt{(1+\sqrt{5})^2+4} = \sqrt{10+2\sqrt{5}}\]
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{(1+\sqrt{5}, 2)^T}{\sqrt{10+2\sqrt{5}}}\]
Similarly, \(A\mathbf{v}_2 = \tfrac{1}{\sqrt{10-2\sqrt{5}}}(3-\sqrt{5},\, 1-\sqrt{5})^T\), with \(\|(3-\sqrt{5},\, 1-\sqrt{5})^T\| = \sqrt{20-8\sqrt{5}}\). Dividing by \(\sigma_2\) and simplifying gives:
\[\mathbf{u}_2 = \frac{A\mathbf{v}_2}{\sigma_2} = \frac{1}{\sqrt{20-8\sqrt{5}}}\begin{bmatrix}3-\sqrt{5}\\1-\sqrt{5}\end{bmatrix} = \frac{1}{\sqrt{10+2\sqrt{5}}}\begin{bmatrix}2\\-1-\sqrt{5}\end{bmatrix}\]
Numerically: \(\sigma_1\approx 1.618\), \(\sigma_2\approx 0.618\), \(\mathbf{v}_1\approx(0.851,\, 0.526)^T\), \(\mathbf{v}_2\approx(-0.526,\, 0.851)^T\), \(\mathbf{u}_1\approx(0.851,\, 0.526)^T\), \(\mathbf{u}_2\approx(0.526,\, -0.851)^T\).
Answer: \(A = U\Sigma V^T\) with \(\sigma_1 = \sqrt{(3+\sqrt{5})/2} \approx 1.618\), \(\sigma_2 = \sqrt{(3-\sqrt{5})/2} \approx 0.618\), \(V = \tfrac{1}{\sqrt{10-2\sqrt{5}}}\begin{bmatrix}2 & 1-\sqrt{5}\\\sqrt{5}-1 & 2\end{bmatrix}\), \(U = \tfrac{1}{\sqrt{10+2\sqrt{5}}}\begin{bmatrix}1+\sqrt{5} & 2\\2 & -1-\sqrt{5}\end{bmatrix}\).
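The closed forms above can be double-checked numerically in NumPy (assumed available). Note that \(\sigma_1\) is the golden ratio \(\varphi\), since \(\varphi^2 = \varphi + 1 = (3+\sqrt{5})/2\):

```python
import numpy as np

# The matrix from this task.
A = np.array([[1.0, 1.0], [1.0, 0.0]])
s = np.linalg.svd(A, compute_uv=False)

phi = (1 + np.sqrt(5)) / 2                     # golden ratio
assert np.allclose(s, [phi, phi - 1])          # sigma_1, sigma_2

# Closed-form singular vectors from the worked solution.
v1 = np.array([2.0, np.sqrt(5) - 1]) / np.sqrt(10 - 2 * np.sqrt(5))
u1 = np.array([1 + np.sqrt(5), 2.0]) / np.sqrt(10 + 2 * np.sqrt(5))
assert np.allclose(A @ v1, s[0] * u1)          # A v_1 = sigma_1 u_1
```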
4.3 SVD of \(A^T\) and \(A^{-1}\); Effect of Scaling (Lab 8, Task 3)
Using the SVD of \(A\) found in Task 4.2: (a) write down the SVD of \(A^T\); (b) find \(A^{-1}\) from the SVD; (c) determine how the singular values change if \(A\) is replaced by \(4A\).
Key Concept: The SVD formulas \(A^T = V\Sigma^T U^T\), \(A^{-1} = V\Sigma^{-1}U^T\), and \((cA) = U(c\Sigma)V^T\) let us read off the SVD of related matrices directly from the SVD of \(A\), without new computation.
Throughout, \(U\), \(\Sigma\), \(V\) refer to the SVD of \(A\) found in Task 4.2, with \(\sigma_1 = \sqrt{(3+\sqrt{5})/2}\) and \(\sigma_2 = \sqrt{(3-\sqrt{5})/2}\).
Part (a) — SVD of \(A^T\).
Step 1: Use the transpose formula. If \(A = U\Sigma V^T\), then \(A^T = (U\Sigma V^T)^T = V\Sigma^T U^T\).
Step 2: Since \(A\) is \(2\times 2\) and square, \(\Sigma^T = \Sigma = \text{diag}(\sigma_1, \sigma_2)\).
Step 3: Therefore \(A^T = V \Sigma U^T\). The left singular vectors of \(A^T\) are the columns of \(V\) (right singular vectors of \(A\)), and the right singular vectors of \(A^T\) are the columns of \(U\) (left singular vectors of \(A\)). The singular values are the same: \(\sigma_1\) and \(\sigma_2\).
Part (b) — SVD of \(A^{-1}\).
Step 1: Since \(A\) is \(2\times 2\) with both singular values non-zero (\(\sigma_1, \sigma_2 > 0\)), \(A\) is invertible.
Step 2: Use the inverse formula: \(A^{-1} = V\Sigma^{-1}U^T\), where \(\Sigma^{-1} = \text{diag}(1/\sigma_1,\, 1/\sigma_2)\).
\[A^{-1} = V \begin{bmatrix} 1/\sigma_1 & 0 \\ 0 & 1/\sigma_2 \end{bmatrix} U^T = V \begin{bmatrix} \sqrt{2/(3+\sqrt{5})} & 0 \\ 0 & \sqrt{2/(3-\sqrt{5})} \end{bmatrix} U^T\]
The singular values of \(A^{-1}\) are \(1/\sigma_2 \geq 1/\sigma_1\) (in decreasing order), approximately \(1/0.618 \approx 1.618\) and \(1/1.618 \approx 0.618\).
Part (c) — Effect of replacing \(A\) by \(4A\).
Step 1: Use the scalar multiple formula: \(4A = U(4\Sigma)V^T\).
Step 2: The matrices \(U\) and \(V\) are unchanged. The singular values each multiply by \(4\): \[\sigma_1(4A) = 4\sigma_1 = 4\sqrt{\frac{3+\sqrt{5}}{2}} \approx 6.472, \qquad \sigma_2(4A) = 4\sigma_2 = 4\sqrt{\frac{3-\sqrt{5}}{2}} \approx 2.472\]
Answer: (a) \(A^T = V\Sigma U^T\) (same \(\Sigma\), roles of \(U\) and \(V\) swapped, same singular values). (b) \(A^{-1} = V\Sigma^{-1}U^T\) with singular values \(1/\sigma_2 \approx 1.618\) and \(1/\sigma_1 \approx 0.618\). (c) The singular values of \(4A\) are \(4\sigma_1 \approx 6.472\) and \(4\sigma_2 \approx 2.472\); \(U\) and \(V\) are unchanged.
4.4 Full SVD and Four Fundamental Subspaces (Lab 8, Task 4)
Let \(A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix}\). Compute the full SVD of \(A\) and identify all four fundamental subspaces.
Key Concept: Column 2 of \(A\) equals \(2 \times\) column 1, so \(A\) has rank 1. A rank-1 matrix has exactly one non-zero singular value, and its four fundamental subspaces are each 1-dimensional (or trivial).
Step 1 — Compute \(A^TA\).
\[A^TA = \begin{bmatrix}1&3\\2&6\end{bmatrix}\begin{bmatrix}1&2\\3&6\end{bmatrix} = \begin{bmatrix}1+9&2+18\\2+18&4+36\end{bmatrix} = \begin{bmatrix}10&20\\20&40\end{bmatrix}\]
Step 2 — Find eigenvalues of \(A^TA\).
\[\det\begin{bmatrix}10-\lambda&20\\20&40-\lambda\end{bmatrix} = (10-\lambda)(40-\lambda) - 400 = \lambda^2 - 50\lambda = \lambda(\lambda-50) = 0\]
So \(\lambda_1 = 50\), \(\lambda_2 = 0\).
Step 3 — Extract singular values.
\[\sigma_1 = \sqrt{50} = 5\sqrt{2}, \qquad \sigma_2 = 0, \qquad \text{rank}(A) = 1\]
Step 4 — Find right singular vectors (columns of \(V\)).
For \(\lambda_1 = 50\): \((A^TA - 50I)\mathbf{v} = 0 \Rightarrow -40v_1 + 20v_2 = 0 \Rightarrow v_2 = 2v_1\). So \(\mathbf{v} \propto (1,2)^T\), normalized: \[\mathbf{v}_1 = \frac{1}{\sqrt{5}}\begin{bmatrix}1\\2\end{bmatrix}\]
For \(\lambda_2 = 0\): \(\mathbf{v}_2\) is orthogonal to \(\mathbf{v}_1\): \[\mathbf{v}_2 = \frac{1}{\sqrt{5}}\begin{bmatrix}-2\\1\end{bmatrix}\]
\[V = \frac{1}{\sqrt{5}}\begin{bmatrix}1&-2\\2&1\end{bmatrix}\]
Step 5 — Compute left singular vectors.
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{1}{5\sqrt{2}} \cdot \frac{1}{\sqrt{5}}\begin{bmatrix}1&2\\3&6\end{bmatrix}\begin{bmatrix}1\\2\end{bmatrix} = \frac{1}{5\sqrt{10}}\begin{bmatrix}5\\15\end{bmatrix} = \frac{1}{\sqrt{10}}\begin{bmatrix}1\\3\end{bmatrix}\]
For \(\mathbf{u}_2\): choose any unit vector orthogonal to \(\mathbf{u}_1\) in \(\mathbb{R}^2\): \[\mathbf{u}_2 = \frac{1}{\sqrt{10}}\begin{bmatrix}-3\\1\end{bmatrix}\]
\[U = \frac{1}{\sqrt{10}}\begin{bmatrix}1&-3\\3&1\end{bmatrix}\]
Step 6 — Assemble SVD.
\[A = U\Sigma V^T = \frac{1}{\sqrt{10}}\begin{bmatrix}1&-3\\3&1\end{bmatrix}\begin{bmatrix}5\sqrt{2}&0\\0&0\end{bmatrix}\frac{1}{\sqrt{5}}\begin{bmatrix}1&2\\-2&1\end{bmatrix}\]
Step 7 — Four fundamental subspaces.
- \(C(A) = \text{span}\{\mathbf{u}_1\} = \text{span}\left\{\tfrac{1}{\sqrt{10}}(1,3)^T\right\}\)
- \(N(A^T) = \text{span}\{\mathbf{u}_2\} = \text{span}\left\{\tfrac{1}{\sqrt{10}}(-3,1)^T\right\}\)
- \(C(A^T) = \text{span}\{\mathbf{v}_1\} = \text{span}\left\{\tfrac{1}{\sqrt{5}}(1,2)^T\right\}\)
- \(N(A) = \text{span}\{\mathbf{v}_2\} = \text{span}\left\{\tfrac{1}{\sqrt{5}}(-2,1)^T\right\}\)
Answer: \(\sigma_1 = 5\sqrt{2}\), \(\sigma_2 = 0\), rank \(= 1\). \(U = \tfrac{1}{\sqrt{10}}\begin{bmatrix}1&-3\\3&1\end{bmatrix}\), \(\Sigma = \begin{bmatrix}5\sqrt{2}&0\\0&0\end{bmatrix}\), \(V = \tfrac{1}{\sqrt{5}}\begin{bmatrix}1&-2\\2&1\end{bmatrix}\).
4.5 SVD of a Symmetric PSD Matrix (Lab 8, Task 5)
Let \(A\) be a \(3 \times 3\) symmetric positive semi-definite matrix with eigenvalues \(9, 4, 0\) and corresponding orthonormal eigenvectors \(\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3\). Write down the SVD of \(A\), identify all four fundamental subspaces, and explain why this case is simpler than the general SVD.
Key Concept: For a symmetric positive semi-definite (PSD) matrix, the eigendecomposition \(A = Q\Lambda Q^T\) is already the SVD, because \(U = V = Q\) and the singular values equal the eigenvalues. No separate computation of \(A^TA\) is required.
Step 1 — Eigendecomposition.
Let \(Q = [\mathbf{q}_1\; \mathbf{q}_2\; \mathbf{q}_3]\) be the matrix of orthonormal eigenvectors and \(\Lambda = \text{diag}(9, 4, 0)\). Since \(A\) is symmetric PSD: \[A = Q\Lambda Q^T = 9\,\mathbf{q}_1\mathbf{q}_1^T + 4\,\mathbf{q}_2\mathbf{q}_2^T + 0\,\mathbf{q}_3\mathbf{q}_3^T\]
Step 2 — Write the SVD.
Comparing with \(A = U\Sigma V^T\): set \(U = V = Q\) and \(\Sigma = \Lambda = \text{diag}(9, 4, 0)\).
\[A = Q\begin{bmatrix}9&0&0\\0&4&0\\0&0&0\end{bmatrix}Q^T, \qquad \sigma_1 = 9,\; \sigma_2 = 4,\; \sigma_3 = 0\]
Step 3 — Rank.
The number of non-zero singular values is 2, so \(\text{rank}(A) = 2\).
Step 4 — Four fundamental subspaces.
Since \(A = A^T\) is symmetric, \(C(A) = C(A^T)\) and \(N(A) = N(A^T)\).
- \(C(A) = C(A^T) = \text{span}\{\mathbf{q}_1, \mathbf{q}_2\}\) (dimension 2)
- \(N(A) = N(A^T) = \text{span}\{\mathbf{q}_3\}\) (dimension 1)
Step 5 — Why this case is simpler.
In the general SVD algorithm, we must: (i) compute \(A^TA\) (a separate matrix product), (ii) find its eigenvectors to form \(V\), (iii) separately compute each \(\mathbf{u}_i = A\mathbf{v}_i/\sigma_i\), and (iv) extend \(U\) using the null space of \(A^T\). For a symmetric PSD matrix, steps (iii) and (iv) are automatic because \(U = V = Q\). Furthermore, \(A^TA = A^2 = Q\Lambda^2 Q^T\), so the eigenvectors of \(A^TA\) are the same as those of \(A\) — no new computation is needed. The SVD reduces entirely to the eigendecomposition, which we already know how to perform.
Answer: SVD is \(A = Q\,\text{diag}(9,4,0)\,Q^T\); \(\sigma_1 = 9\), \(\sigma_2 = 4\), \(\sigma_3 = 0\); rank \(= 2\); \(C(A) = C(A^T) = \text{span}\{\mathbf{q}_1, \mathbf{q}_2\}\); \(N(A) = N(A^T) = \text{span}\{\mathbf{q}_3\}\). The simplification is that \(U = V = Q\), so the SVD is identical to the eigendecomposition.
4.6 Best Rank-1 and Rank-2 Approximations (Lab 8, Task 6)
Suppose \(A = 12\,\mathbf{u}_1\mathbf{v}_1^T + 5\,\mathbf{u}_2\mathbf{v}_2^T + \mathbf{u}_3\mathbf{v}_3^T\) is already given in its SVD rank-1 expansion. Find the best rank-1 approximation \(A_1\), the best rank-2 approximation \(A_2\), and compute the Frobenius errors \(\|A - A_1\|_F\) and \(\|A - A_2\|_F\).
Key Concept: By the Eckart–Young theorem, the best rank-\(k\) approximation is obtained by keeping the \(k\) largest singular value terms in the SVD expansion. The Frobenius error equals the square root of the sum of squares of the discarded singular values.
The singular values are \(\sigma_1 = 12\), \(\sigma_2 = 5\), \(\sigma_3 = 1\) (decreasing order, as required for SVD).
Step 1 — Best rank-1 approximation \(A_1\).
Keep only the term with the largest singular value: \[A_1 = \sigma_1\,\mathbf{u}_1\mathbf{v}_1^T = 12\,\mathbf{u}_1\mathbf{v}_1^T\]
Step 2 — Best rank-2 approximation \(A_2\).
Keep the two largest terms: \[A_2 = \sigma_1\,\mathbf{u}_1\mathbf{v}_1^T + \sigma_2\,\mathbf{u}_2\mathbf{v}_2^T = 12\,\mathbf{u}_1\mathbf{v}_1^T + 5\,\mathbf{u}_2\mathbf{v}_2^T\]
Step 3 — Frobenius error \(\|A - A_1\|_F\).
By Eckart–Young, the discarded singular values are \(\sigma_2 = 5\) and \(\sigma_3 = 1\): \[\|A - A_1\|_F = \sqrt{\sigma_2^2 + \sigma_3^2} = \sqrt{25 + 1} = \sqrt{26} \approx 5.10\]
Step 4 — Frobenius error \(\|A - A_2\|_F\).
Only \(\sigma_3 = 1\) is discarded: \[\|A - A_2\|_F = \sqrt{\sigma_3^2} = \sqrt{1} = 1\]
Step 5 — Interpretation.
The rank-1 approximation retains \(\sigma_1^2/(\sigma_1^2+\sigma_2^2+\sigma_3^2) = 144/170 \approx 84.7\%\) of total energy. The rank-2 approximation retains \(169/170 \approx 99.4\%\) of total energy. Adding the second singular-value term reduces the error from \(\sqrt{26}\approx 5.10\) to \(1\).
Answer: \(A_1 = 12\,\mathbf{u}_1\mathbf{v}_1^T\), \(A_2 = 12\,\mathbf{u}_1\mathbf{v}_1^T + 5\,\mathbf{u}_2\mathbf{v}_2^T\); \(\|A-A_1\|_F = \sqrt{26} \approx 5.10\); \(\|A-A_2\|_F = 1\).
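These Eckart–Young computations are easy to check numerically. A minimal NumPy sketch — the orthonormal vectors here are random stand-ins generated by QR, not the problem's actual \(\mathbf{u}_i, \mathbf{v}_i\), but the Frobenius errors depend only on the singular values:

```python
import numpy as np

rng = np.random.default_rng(0)
# Random orthonormal bases as stand-ins for the u_i and v_i.
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))
sigma = np.array([12.0, 5.0, 1.0])

# A = 12 u1 v1^T + 5 u2 v2^T + 1 u3 v3^T
A = sum(s * np.outer(U[:, i], V[:, i]) for i, s in enumerate(sigma))

# Best rank-1 and rank-2 approximations keep the largest terms.
A1 = sigma[0] * np.outer(U[:, 0], V[:, 0])
A2 = A1 + sigma[1] * np.outer(U[:, 1], V[:, 1])

err1 = np.linalg.norm(A - A1, 'fro')  # sqrt(5^2 + 1^2) = sqrt(26)
err2 = np.linalg.norm(A - A2, 'fro')  # sqrt(1^2) = 1
```

Because the \(\mathbf{u}_i\) and \(\mathbf{v}_i\) are orthonormal, the errors come out to \(\sqrt{26}\) and \(1\) no matter which orthonormal bases are chosen.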
4.7 Energy Retention and Storage Comparison (Lab 8, Task 7)
A matrix \(M \in \mathbb{R}^{1000 \times 800}\) has singular values \(\sigma_1 = 120\), \(\sigma_2 = 40\), \(\sigma_3 = 10\), \(\sigma_4 = 2\), and all remaining singular values are zero. Find the smallest \(k\) such that the rank-\(k\) approximation retains at least \(99\%\) of the total energy. Compare the storage cost of the rank-\(k\) approximation with the storage cost of the full matrix.
Click to see the solution
Key Concept: The energy fraction captured by the top \(k\) singular values is \(\sum_{i=1}^k \sigma_i^2 / \sum_{i=1}^r \sigma_i^2\). We find the smallest \(k\) for which this fraction reaches \(99\%\), then compare storage costs.
Step 1 — Compute total energy.
\[E_{\text{total}} = \sigma_1^2 + \sigma_2^2 + \sigma_3^2 + \sigma_4^2 = 120^2 + 40^2 + 10^2 + 2^2 = 14400 + 1600 + 100 + 4 = 16104\]
Step 2 — Check \(k = 1\).
\[E_1 = 14400, \qquad \frac{E_1}{E_{\text{total}}} = \frac{14400}{16104} \approx 89.4\% < 99\%\]
Step 3 — Check \(k = 2\).
\[E_2 = 14400 + 1600 = 16000, \qquad \frac{E_2}{E_{\text{total}}} = \frac{16000}{16104} \approx 99.35\% \geq 99\%\] ✓
The smallest \(k\) that retains at least \(99\%\) of total energy is \(\boxed{k = 2}\).
Step 4 — Compare storage costs.
- Full matrix: \(1000 \times 800 = 800{,}000\) numbers.
- Rank-2 SVD: we store \(k\) left singular vectors (\(mk = 1000 \times 2 = 2000\) numbers), \(k\) right singular vectors (\(nk = 800 \times 2 = 1600\) numbers), and \(k\) singular values (\(2\) numbers). Total: \(k(m+n+1) = 2(1000+800+1) = 2 \times 1801 = 3602\) numbers.
- Compression ratio: \(3602 / 800{,}000 \approx 0.45\%\).
The rank-2 SVD requires less than \(0.5\%\) of the storage of the full matrix while capturing over \(99\%\) of the energy.
Step 5 — Other applications of this technique.
Beyond image compression, this approach is used in: recommendation systems (matrix factorization for collaborative filtering), natural language processing (latent semantic analysis), network analysis (compressing adjacency matrices), and any machine learning pipeline where the data matrix has low effective rank.
Answer: \(k = 2\) (energy fraction \(\approx 99.35\%\)). Storage: \(3{,}602\) numbers vs. \(800{,}000\) for the full matrix — a compression ratio of \(\approx 0.45\%\).
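The energy-threshold search and the storage comparison can be sketched in a few lines of NumPy (the \(99\%\) threshold and the matrix dimensions are taken from the problem statement):

```python
import numpy as np

m, n = 1000, 800
sigma = np.array([120.0, 40.0, 10.0, 2.0])  # the non-zero singular values

# Cumulative energy fraction captured by the top k singular values.
energy_frac = np.cumsum(sigma**2) / np.sum(sigma**2)
# Smallest k whose cumulative energy fraction reaches 99%.
k = int(np.searchsorted(energy_frac, 0.99)) + 1

storage_full = m * n              # 800,000 numbers
storage_rank_k = k * (m + n + 1)  # k(m + n + 1) numbers
```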
4.8 Full SVD and Best Rank-1 Approximation (Lab 8, Task 8)
Let \(A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}\). Compute the full SVD, identify all four fundamental subspaces, and write down the best rank-1 approximation \(A_1\).
Click to see the solution
Key Concept: For a \(2 \times 3\) matrix of rank 2, \(A^TA\) is \(3 \times 3\) and has one zero eigenvalue. We find all three eigenvectors of \(A^TA\), then compute \(U\) from the two non-zero singular pairs, and finally form \(A_1\) using the dominant term.
Step 1 — Compute \(A^TA\) (\(3\times 3\)).
\[A^TA = \begin{bmatrix}1&0\\1&1\\0&1\end{bmatrix}\begin{bmatrix}1&1&0\\0&1&1\end{bmatrix} = \begin{bmatrix}1&1&0\\1&2&1\\0&1&1\end{bmatrix}\]
Step 2 — Find eigenvalues of \(A^TA\).
\[\det(A^TA - \lambda I) = (1-\lambda)\bigl[(2-\lambda)(1-\lambda)-1\bigr] - \bigl[(1-\lambda)\bigr]\] \[= (1-\lambda)\bigl[\lambda^2-3\lambda+1\bigr] - (1-\lambda) = (1-\lambda)\bigl[\lambda^2-3\lambda\bigr] = (1-\lambda)\cdot\lambda(\lambda-3)\]
Eigenvalues: \(\lambda_1 = 3\), \(\lambda_2 = 1\), \(\lambda_3 = 0\). Thus \(\sigma_1 = \sqrt{3}\), \(\sigma_2 = 1\), \(\sigma_3 = 0\), and \(\text{rank}(A) = 2\).
Step 3 — Find right singular vectors (columns of \(V\)).
For \(\lambda_1 = 3\): \((A^TA - 3I)\mathbf{v} = 0\) gives the system \(-2v_1+v_2=0\), \(v_1-v_2+v_3=0\), \(v_2-2v_3=0\). From row-reduction: \(v_2 = 2v_3\) and \(v_1 = v_3\), so \(\mathbf{v}_1 \propto (1,2,1)^T\). \[\mathbf{v}_1 = \frac{1}{\sqrt{6}}\begin{bmatrix}1\\2\\1\end{bmatrix}\]
For \(\lambda_2 = 1\): \((A^TA - I)\mathbf{v} = 0\): \(v_2 = 0\) and \(v_1 + v_3 = 0\), so \(\mathbf{v}_2 \propto (1,0,-1)^T\). \[\mathbf{v}_2 = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\0\\-1\end{bmatrix}\]
For \(\lambda_3 = 0\): null space of \(A\): \(v_1+v_2=0\), \(v_2+v_3=0\), so \(\mathbf{v}_3 \propto (1,-1,1)^T\). \[\mathbf{v}_3 = \frac{1}{\sqrt{3}}\begin{bmatrix}1\\-1\\1\end{bmatrix}\]
Step 4 — Compute left singular vectors.
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{1}{\sqrt{3}}\cdot\frac{1}{\sqrt{6}}\begin{bmatrix}1&1&0\\0&1&1\end{bmatrix}\begin{bmatrix}1\\2\\1\end{bmatrix} = \frac{1}{\sqrt{18}}\begin{bmatrix}3\\3\end{bmatrix} = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\1\end{bmatrix}\]
\[\mathbf{u}_2 = \frac{A\mathbf{v}_2}{\sigma_2} = \frac{1}{\sqrt{2}}\begin{bmatrix}1&1&0\\0&1&1\end{bmatrix}\begin{bmatrix}1\\0\\-1\end{bmatrix} = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\-1\end{bmatrix}\]
Step 5 — Assemble SVD.
\[U = \frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&-1\end{bmatrix}, \quad \Sigma = \begin{bmatrix}\sqrt{3}&0&0\\0&1&0\end{bmatrix}, \quad V = \begin{bmatrix}1/\sqrt{6}&1/\sqrt{2}&1/\sqrt{3}\\2/\sqrt{6}&0&-1/\sqrt{3}\\1/\sqrt{6}&-1/\sqrt{2}&1/\sqrt{3}\end{bmatrix}\]
Step 6 — Four fundamental subspaces.
- \(C(A) = \text{span}\{\mathbf{u}_1, \mathbf{u}_2\} = \mathbb{R}^2\) (full, dimension 2)
- \(N(A^T) = \{0\}\) (dimension 0, since \(m = r = 2\))
- \(C(A^T) = \text{span}\{\mathbf{v}_1, \mathbf{v}_2\}\) (dimension 2)
- \(N(A) = \text{span}\{\mathbf{v}_3\} = \text{span}\left\{\tfrac{1}{\sqrt{3}}(1,-1,1)^T\right\}\) (dimension 1)
Step 7 — Best rank-1 approximation.
\[A_1 = \sigma_1\,\mathbf{u}_1\mathbf{v}_1^T = \sqrt{3}\cdot\frac{1}{\sqrt{2}}\begin{bmatrix}1\\1\end{bmatrix}\cdot\frac{1}{\sqrt{6}}\begin{bmatrix}1&2&1\end{bmatrix} = \frac{\sqrt{3}}{\sqrt{12}}\begin{bmatrix}1&2&1\\1&2&1\end{bmatrix} = \frac{1}{2}\begin{bmatrix}1&2&1\\1&2&1\end{bmatrix}\]
Answer: \(\sigma_1 = \sqrt{3}\), \(\sigma_2 = 1\), \(\sigma_3 = 0\), rank \(= 2\). The best rank-1 approximation is \(A_1 = \tfrac{1}{2}\begin{bmatrix}1&2&1\\1&2&1\end{bmatrix}\).
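A quick check of this example with `np.linalg.svd` — the rank-1 product \(\sigma_1\mathbf{u}_1\mathbf{v}_1^T\) is unaffected by the sign ambiguity in the individual singular vectors:

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
U, s, Vt = np.linalg.svd(A)  # full SVD: U is 2x2, Vt is 3x3

# Best rank-1 approximation from the dominant singular triplet.
A1 = s[0] * np.outer(U[:, 0], Vt[0, :])
```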
4.9 Singular Values and Rank for Three Matrices (Assignment 8, Problem 1)
Compute the singular values and determine the rank of each of the following matrices: (a) \(A = \begin{bmatrix} 3 & 0 \\ 0 & -5 \end{bmatrix}\), (b) \(B = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}\), (c) \(C = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}\).
Click to see the solution
Key Concept: The singular values of \(A\) are the square roots of the eigenvalues of \(A^TA\). For diagonal or rank-deficient matrices these computations simplify considerably: diagonal entries give singular values directly (as absolute values), and rank-deficient matrices always have at least one zero singular value.
Part (a) — \(A = \begin{bmatrix}3&0\\0&-5\end{bmatrix}\).
Step 1: \(A^TA = \begin{bmatrix}3&0\\0&-5\end{bmatrix}^T\begin{bmatrix}3&0\\0&-5\end{bmatrix} = \begin{bmatrix}9&0\\0&25\end{bmatrix}\) (diagonal, so eigenvalues are \(\lambda_1=25\), \(\lambda_2=9\)).
Step 2: \(\sigma_1 = \sqrt{25} = 5\), \(\sigma_2 = \sqrt{9} = 3\). Note that singular values are always non-negative, even though \(A_{22} = -5\).
Step 3: Both singular values are positive, so \(\text{rank}(A) = 2\).
Part (b) — \(B = \begin{bmatrix}1&2\\2&4\end{bmatrix}\).
Step 1: Observe that row 2 \(= 2 \times\) row 1, so \(\text{rank}(B) = 1\).
Step 2: \(B^TB = \begin{bmatrix}1&2\\2&4\end{bmatrix}\begin{bmatrix}1&2\\2&4\end{bmatrix} = \begin{bmatrix}5&10\\10&20\end{bmatrix}\).
Step 3: \(\det(B^TB - \lambda I) = \lambda^2 - 25\lambda = \lambda(\lambda-25) = 0\), giving \(\lambda_1 = 25\), \(\lambda_2 = 0\).
Step 4: \(\sigma_1 = \sqrt{25} = 5\), \(\sigma_2 = 0\). \(\text{rank}(B) = 1\).
Part (c) — \(C = \begin{bmatrix}0&0\\0&0\end{bmatrix}\).
Step 1: \(C^TC = \begin{bmatrix}0&0\\0&0\end{bmatrix}\). All eigenvalues are \(0\).
Step 2: \(\sigma_1 = \sigma_2 = 0\). \(\text{rank}(C) = 0\).
Answer: (a) \(\sigma_1=5\), \(\sigma_2=3\), rank \(=2\). (b) \(\sigma_1=5\), \(\sigma_2=0\), rank \(=1\). (c) \(\sigma_1=\sigma_2=0\), rank \(=0\).
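All three answers can be verified with `compute_uv=False`, which returns only the singular values; counting those above a small tolerance gives the rank:

```python
import numpy as np

sA = np.linalg.svd(np.array([[3.0, 0.0], [0.0, -5.0]]), compute_uv=False)
sB = np.linalg.svd(np.array([[1.0, 2.0], [2.0, 4.0]]), compute_uv=False)
sC = np.linalg.svd(np.zeros((2, 2)), compute_uv=False)

# Rank = number of singular values above a numerical tolerance.
rankA = int(np.sum(sA > 1e-12))
rankB = int(np.sum(sB > 1e-12))
rankC = int(np.sum(sC > 1e-12))
```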
4.10 Full SVD of a \(3 \times 2\) Matrix (Assignment 8, Problem 2)
Compute the full SVD of \(A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 0 & 0 \end{bmatrix}\).
Click to see the solution
Key Concept: \(A\) is a tall (\(3\times 2\)) matrix with diagonal non-zero entries. Because \(A^TA\) is diagonal, the computation is straightforward. We must extend \(U\) to a full \(3\times 3\) orthogonal matrix by adding a third column spanning the null space of \(A^T\).
Step 1 — Compute \(A^TA\) (\(2\times 2\)).
\[A^TA = \begin{bmatrix}1&0&0\\0&2&0\end{bmatrix}\begin{bmatrix}1&0\\0&2\\0&0\end{bmatrix} = \begin{bmatrix}1&0\\0&4\end{bmatrix}\]
Step 2 — Eigenvalues and singular values.
\(A^TA = \text{diag}(1,4)\), so \(\lambda_1 = 4\), \(\lambda_2 = 1\). Thus \(\sigma_1 = 2\), \(\sigma_2 = 1\), rank \(= 2\).
Step 3 — Right singular vectors (columns of \(V\)).
Eigenvector for \(\lambda_1 = 4\): \(\mathbf{v}_1 = (0,1)^T\). Eigenvector for \(\lambda_2 = 1\): \(\mathbf{v}_2 = (1,0)^T\). \[V = \begin{bmatrix}0&1\\1&0\end{bmatrix}\]
Step 4 — Left singular vectors.
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{1}{2}\begin{bmatrix}0\\2\\0\end{bmatrix} = \begin{bmatrix}0\\1\\0\end{bmatrix}\]
\[\mathbf{u}_2 = \frac{A\mathbf{v}_2}{\sigma_2} = \frac{1}{1}\begin{bmatrix}1\\0\\0\end{bmatrix} = \begin{bmatrix}1\\0\\0\end{bmatrix}\]
Step 5 — Extend \(U\) to \(3\times 3\).
We need \(\mathbf{u}_3\) orthogonal to both \(\mathbf{u}_1 = (0,1,0)^T\) and \(\mathbf{u}_2 = (1,0,0)^T\). Up to sign, the only such unit vector is: \[\mathbf{u}_3 = \begin{bmatrix}0\\0\\1\end{bmatrix}\]
\[U = \begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix}\]
Step 6 — Assemble and verify.
\[A = U\Sigma V^T = \begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix}\begin{bmatrix}2&0\\0&1\\0&0\end{bmatrix}\begin{bmatrix}0&1\\1&0\end{bmatrix}\]
Verification: \(U\Sigma = \begin{bmatrix}0&1\\2&0\\0&0\end{bmatrix}\), then \(U\Sigma V^T = \begin{bmatrix}0&1\\2&0\\0&0\end{bmatrix}\begin{bmatrix}0&1\\1&0\end{bmatrix} = \begin{bmatrix}1&0\\0&2\\0&0\end{bmatrix} = A\). ✓
Answer: \(U = \begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix}\), \(\Sigma = \begin{bmatrix}2&0\\0&1\\0&0\end{bmatrix}\), \(V = \begin{bmatrix}0&1\\1&0\end{bmatrix}\); singular values \(\sigma_1 = 2\), \(\sigma_2 = 1\), rank \(= 2\).
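For a tall matrix, NumPy's full SVD returns a square \(3\times 3\) \(U\) and only the singular values as a vector, so \(\Sigma\) must be rebuilt as a rectangular \(3\times 2\) matrix before checking the product:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [0.0, 0.0]])
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the 3x2 rectangular Sigma from the singular values.
Sigma = np.zeros_like(A)
Sigma[:2, :2] = np.diag(s)
```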
4.11 Reduced SVD of a Tall Matrix (Assignment 8, Problem 3)
Compute the reduced (thin) SVD of \(A = \begin{bmatrix} 3 & 0 \\ 0 & 0 \\ 0 & 4 \end{bmatrix}\).
Click to see the solution
Key Concept: The reduced (thin) SVD keeps only the \(r\) non-zero singular triplets: \(A = \hat{U}\hat{\Sigma}\hat{V}^T\) where \(\hat{U}\) is \(m\times r\), \(\hat{\Sigma}\) is \(r\times r\) diagonal, and \(\hat{V}\) is \(n\times r\). For this \(3\times 2\) matrix with rank 2, the reduced SVD has the same \(V\) but \(U\) has only 2 columns (no third extension column needed).
Step 1 — Compute \(A^TA\) (\(2\times 2\)).
\[A^TA = \begin{bmatrix}3&0&0\\0&0&4\end{bmatrix}\begin{bmatrix}3&0\\0&0\\0&4\end{bmatrix} = \begin{bmatrix}9&0\\0&16\end{bmatrix}\]
Step 2 — Eigenvalues and singular values.
\(A^TA = \text{diag}(9,16)\), so \(\lambda_1 = 16\), \(\lambda_2 = 9\). Thus \(\sigma_1 = 4\), \(\sigma_2 = 3\), rank \(= 2\).
Step 3 — Right singular vectors (columns of \(\hat{V}\)).
Eigenvector for \(\lambda_1 = 16\): \(\mathbf{v}_1 = (0,1)^T\). Eigenvector for \(\lambda_2 = 9\): \(\mathbf{v}_2 = (1,0)^T\). \[\hat{V} = \begin{bmatrix}0&1\\1&0\end{bmatrix}\]
Step 4 — Left singular vectors (columns of \(\hat{U}\)).
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{1}{4}\begin{bmatrix}3&0\\0&0\\0&4\end{bmatrix}\begin{bmatrix}0\\1\end{bmatrix} = \frac{1}{4}\begin{bmatrix}0\\0\\4\end{bmatrix} = \begin{bmatrix}0\\0\\1\end{bmatrix}\]
\[\mathbf{u}_2 = \frac{A\mathbf{v}_2}{\sigma_2} = \frac{1}{3}\begin{bmatrix}3\\0\\0\end{bmatrix} = \begin{bmatrix}1\\0\\0\end{bmatrix}\]
\[\hat{U} = \begin{bmatrix}0&1\\0&0\\1&0\end{bmatrix}\]
Step 5 — Assemble reduced SVD.
\[A = \hat{U}\hat{\Sigma}\hat{V}^T = \begin{bmatrix}0&1\\0&0\\1&0\end{bmatrix}\begin{bmatrix}4&0\\0&3\end{bmatrix}\begin{bmatrix}0&1\\1&0\end{bmatrix}\]
Verification: \(\hat{U}\hat{\Sigma} = \begin{bmatrix}0&3\\0&0\\4&0\end{bmatrix}\), then \(\begin{bmatrix}0&3\\0&0\\4&0\end{bmatrix}\begin{bmatrix}0&1\\1&0\end{bmatrix} = \begin{bmatrix}3&0\\0&0\\0&4\end{bmatrix} = A\). ✓
Answer: Reduced SVD: \(\hat{U} = \begin{bmatrix}0&1\\0&0\\1&0\end{bmatrix}\), \(\hat{\Sigma} = \begin{bmatrix}4&0\\0&3\end{bmatrix}\), \(\hat{V} = \begin{bmatrix}0&1\\1&0\end{bmatrix}\); singular values \(\sigma_1 = 4\), \(\sigma_2 = 3\).
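The reduced SVD corresponds to `full_matrices=False` in NumPy: \(\hat{U}\) keeps only \(\min(m,n)\) columns and no extension step is needed:

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [0.0, 0.0],
              [0.0, 4.0]])
Uh, s, Vht = np.linalg.svd(A, full_matrices=False)  # thin SVD: Uh is 3x2
```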
4.12 SVD of a Rank-Deficient \(2 \times 2\) Matrix (Assignment 8, Problem 4)
Find the singular values, rank, and full SVD of \(A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\).
Click to see the solution
Key Concept: Both rows of \(A\) are identical, so \(A\) has rank 1 and exactly one non-zero singular value. To obtain the full SVD we must extend \(U\) to a \(2\times 2\) orthogonal matrix by choosing a second column orthogonal to \(\mathbf{u}_1\).
Step 1 — Compute \(A^TA\).
\[A^TA = \begin{bmatrix}1&1\\1&1\end{bmatrix}\begin{bmatrix}1&1\\1&1\end{bmatrix} = \begin{bmatrix}2&2\\2&2\end{bmatrix}\]
Step 2 — Eigenvalues and singular values.
\(\det(A^TA - \lambda I) = (2-\lambda)^2 - 4 = \lambda^2-4\lambda = \lambda(\lambda-4) = 0\). So \(\lambda_1 = 4\), \(\lambda_2 = 0\), giving \(\sigma_1 = 2\), \(\sigma_2 = 0\), rank \(= 1\).
Step 3 — Right singular vectors.
For \(\lambda_1 = 4\): \((A^TA - 4I)\mathbf{v} = 0 \Rightarrow -2v_1+2v_2=0 \Rightarrow v_1=v_2\). So \(\mathbf{v}_1 = \tfrac{1}{\sqrt{2}}(1,1)^T\).
For \(\lambda_2 = 0\): orthogonal complement: \(\mathbf{v}_2 = \tfrac{1}{\sqrt{2}}(1,-1)^T\).
\[V = \frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\]
Step 4 — Left singular vectors.
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{1}{2}\cdot\frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&1\end{bmatrix}\begin{bmatrix}1\\1\end{bmatrix} = \frac{1}{2\sqrt{2}}\begin{bmatrix}2\\2\end{bmatrix} = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\1\end{bmatrix}\]
For \(\mathbf{u}_2\): orthogonal to \(\mathbf{u}_1\): \(\mathbf{u}_2 = \tfrac{1}{\sqrt{2}}(1,-1)^T\).
\[U = \frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\]
Step 5 — Assemble and verify.
\[A = U\Sigma V^T = \frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\begin{bmatrix}2&0\\0&0\end{bmatrix}\frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\]
Verification: \(U\Sigma V^T = \tfrac{1}{2}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\begin{bmatrix}2&0\\0&0\end{bmatrix}\begin{bmatrix}1&1\\1&-1\end{bmatrix} = \tfrac{1}{2}\begin{bmatrix}2&0\\2&0\end{bmatrix}\begin{bmatrix}1&1\\1&-1\end{bmatrix} = \begin{bmatrix}1&1\\1&1\end{bmatrix} = A\). ✓
Answer: \(\sigma_1 = 2\), \(\sigma_2 = 0\), rank \(= 1\). \(U = V = \tfrac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\), \(\Sigma = \begin{bmatrix}2&0\\0&0\end{bmatrix}\).
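Numerically, the zero singular value of a rank-deficient matrix shows up as a tiny floating-point number, so the rank is counted against a tolerance:

```python
import numpy as np

A = np.ones((2, 2))
s = np.linalg.svd(A, compute_uv=False)
rank = int(np.sum(s > 1e-12))  # sigma_2 is ~1e-16, below tolerance
```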
4.13 SVD of a Wide Matrix (Assignment 8, Problem 5)
Find the singular values, rank, and full SVD of \(A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \end{bmatrix}\).
Click to see the solution
Key Concept: \(A\) is a \(2\times 3\) wide matrix. Its \(A^TA\) is \(3\times 3\) and diagonal (since \(A\) itself is essentially diagonal with a trailing zero column). We must extend \(V\) to a full \(3\times 3\) orthogonal matrix by including the null-space vector.
Step 1 — Compute \(A^TA\) (\(3\times 3\)).
\[A^TA = \begin{bmatrix}1&0\\0&2\\0&0\end{bmatrix}\begin{bmatrix}1&0&0\\0&2&0\end{bmatrix} = \begin{bmatrix}1&0&0\\0&4&0\\0&0&0\end{bmatrix}\]
Step 2 — Eigenvalues and singular values.
\(A^TA = \text{diag}(1,4,0)\), so \(\lambda_1 = 4\), \(\lambda_2 = 1\), \(\lambda_3 = 0\). Thus \(\sigma_1 = 2\), \(\sigma_2 = 1\), \(\sigma_3 = 0\), rank \(= 2\).
Step 3 — Right singular vectors (columns of \(V\)).
Eigenvectors of \(A^TA\) (read off directly since it is diagonal):
\(\mathbf{v}_1 = (0,1,0)^T\) (for \(\lambda_1 = 4\)), \(\mathbf{v}_2 = (1,0,0)^T\) (for \(\lambda_2 = 1\)), \(\mathbf{v}_3 = (0,0,1)^T\) (for \(\lambda_3 = 0\)).
\[V = \begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix}\]
Step 4 — Left singular vectors.
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{1}{2}\begin{bmatrix}1&0&0\\0&2&0\end{bmatrix}\begin{bmatrix}0\\1\\0\end{bmatrix} = \frac{1}{2}\begin{bmatrix}0\\2\end{bmatrix} = \begin{bmatrix}0\\1\end{bmatrix}\]
\[\mathbf{u}_2 = \frac{A\mathbf{v}_2}{\sigma_2} = \frac{1}{1}\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}1\\0\end{bmatrix}\]
\[U = \begin{bmatrix}0&1\\1&0\end{bmatrix}\]
Step 5 — Assemble SVD.
\[A = U\Sigma V^T = \begin{bmatrix}0&1\\1&0\end{bmatrix}\begin{bmatrix}2&0&0\\0&1&0\end{bmatrix}\begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix}\]
Verification: \(U\Sigma = \begin{bmatrix}0&1&0\\2&0&0\end{bmatrix}\), then \(U\Sigma V^T = \begin{bmatrix}0&1&0\\2&0&0\end{bmatrix}\begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix} = \begin{bmatrix}1&0&0\\0&2&0\end{bmatrix} = A\). ✓
Answer: \(\sigma_1 = 2\), \(\sigma_2 = 1\), \(\sigma_3 = 0\), rank \(= 2\). \(U = \begin{bmatrix}0&1\\1&0\end{bmatrix}\), \(\Sigma = \begin{bmatrix}2&0&0\\0&1&0\end{bmatrix}\), \(V = \begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix}\).
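For a wide matrix the full SVD returns a \(3\times 3\) \(V^T\); its last row, the direction for \(\sigma_3 = 0\), should lie in \(N(A)\):

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])
U, s, Vt = np.linalg.svd(A, full_matrices=True)

v3 = Vt[2, :]        # right singular vector for sigma_3 = 0
null_check = A @ v3  # should be the zero vector
```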
4.14 True/False: SVD Properties (Assignment 8, Problem 6)
Determine whether each statement is true or false. Justify your answer.
Every square matrix has a unique SVD.
The singular values of \(A\) are always non-negative.
If \(A\) is symmetric positive definite, then its singular values equal its eigenvalues.
The rank of a matrix equals the number of non-zero singular values.
Click to see the solution
Key Concept: These questions test the core properties of singular values and the SVD. Remember: singular values are unique, but singular vectors (and hence the full factorization \(U\), \(V\)) are not.
Part (a) — “Every square matrix has a unique SVD.” — FALSE.
Step 1: Every matrix (square or not) does have an SVD, so existence is true.
Step 2: However, uniqueness fails. The singular values \(\sigma_i\) are unique (they are determined by the eigenvalues of \(A^TA\)), but the singular vectors are not. For example, if \(\sigma_i = \sigma_j\) (repeated singular values), any rotation in the corresponding eigenspace of \(A^TA\) yields another valid set of right singular vectors, and correspondingly another valid \(U\). Even for distinct singular values, changing the sign of \(\mathbf{v}_i\) and compensating in \(\mathbf{u}_i\) gives a different but valid SVD.
Conclusion: The SVD exists for every matrix but is not unique in general.
Part (b) — “The singular values of \(A\) are always non-negative.” — TRUE.
Step 1: Singular values are defined as \(\sigma_i = \sqrt{\lambda_i(A^TA)}\).
Step 2: The matrix \(A^TA\) is symmetric positive semi-definite: for any vector \(\mathbf{x}\), \(\mathbf{x}^T(A^TA)\mathbf{x} = \|A\mathbf{x}\|^2 \geq 0\). Therefore all eigenvalues of \(A^TA\) are \(\geq 0\).
Step 3: The square root of a non-negative number is non-negative, so \(\sigma_i = \sqrt{\lambda_i} \geq 0\) always.
Part (c) — “If \(A\) is symmetric positive definite, then its singular values equal its eigenvalues.” — TRUE.
Step 1: If \(A\) is symmetric positive definite (SPD), then \(A = Q\Lambda Q^T\) where \(\Lambda = \text{diag}(\lambda_1,\ldots,\lambda_n)\) with all \(\lambda_i > 0\).
Step 2: For a symmetric matrix, \(A^TA = A^2 = Q\Lambda^2 Q^T\). The eigenvalues of \(A^TA\) are \(\lambda_i^2\).
Step 3: Therefore \(\sigma_i = \sqrt{\lambda_i^2} = |\lambda_i| = \lambda_i\) (since all \(\lambda_i > 0\) for SPD).
Conclusion: The singular values and eigenvalues coincide for symmetric positive definite matrices.
Part (d) — “The rank of a matrix equals the number of non-zero singular values.” — TRUE.
Step 1: The rank of \(A\) equals the dimension of its column space \(C(A)\), which by the SVD equals the number of non-zero singular values \(r\) (those \(\sigma_i > 0\) contribute \(r\) independent columns \(\mathbf{u}_i = A\mathbf{v}_i/\sigma_i\) to \(C(A)\)).
Step 2: Formally: \(A = U\Sigma V^T\) and the zero singular values correspond to terms that vanish in the sum \(A = \sum_{i=1}^r \sigma_i\mathbf{u}_i\mathbf{v}_i^T\), confirming rank equals the number of non-zero \(\sigma_i\).
Answer: (a) False — singular values are unique but singular vectors are not, so the full SVD factorization is not unique. (b) True — singular values are square roots of eigenvalues of the PSD matrix \(A^TA\). (c) True — for SPD \(A\), \(\sigma_i = \lambda_i > 0\). (d) True — rank equals the number of strictly positive singular values.
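Statements (a), (b), and (d) can be illustrated on a random matrix: flipping the signs of a matched \(\mathbf{u}_i, \mathbf{v}_i\) pair produces a second, equally valid SVD.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
U, s, Vt = np.linalg.svd(A)

# (a) Non-uniqueness: flip u1 and v1 together -- still a valid SVD.
U2, Vt2 = U.copy(), Vt.copy()
U2[:, 0] *= -1
Vt2[0, :] *= -1
Sigma = np.zeros((4, 3))
Sigma[:3, :3] = np.diag(s)

# (b) Singular values are non-negative; (d) rank = # of non-zero sigma_i.
all_nonneg = bool(np.all(s >= 0))
rank = int(np.sum(s > 1e-12))
```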
4.15 SVD of a Symmetric Matrix (Assignment 8, Problem 7)
Compute the SVD of \(A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}\).
Click to see the solution
Key Concept: \(A\) is a symmetric matrix, so its eigendecomposition gives the SVD directly: \(U = V = Q\) (the eigenvector matrix) and the singular values are the (positive) eigenvalues. This simplifies the computation significantly.
Step 1 — Compute \(A^TA\).
Since \(A = A^T\), we have \(A^TA = A^2\). But it is easier to diagonalize \(A\) directly.
Step 2 — Eigenvalues of \(A\).
\[\det(A - \lambda I) = (3-\lambda)^2 - 1 = \lambda^2 - 6\lambda + 8 = (\lambda-4)(\lambda-2) = 0\]
So \(\lambda_1 = 4\), \(\lambda_2 = 2\). Both are positive, confirming \(A\) is SPD.
Step 3 — Singular values.
\[\sigma_1 = \lambda_1 = 4, \qquad \sigma_2 = \lambda_2 = 2\]
Step 4 — Eigenvectors of \(A\) (= right and left singular vectors).
For \(\lambda_1 = 4\): \((A-4I)\mathbf{v} = 0 \Rightarrow -v_1+v_2=0 \Rightarrow v_1=v_2\). So \(\mathbf{v}_1 = \tfrac{1}{\sqrt{2}}(1,1)^T\).
For \(\lambda_2 = 2\): \((A-2I)\mathbf{v} = 0 \Rightarrow v_1+v_2=0 \Rightarrow v_2=-v_1\). So \(\mathbf{v}_2 = \tfrac{1}{\sqrt{2}}(1,-1)^T\).
\[Q = V = U = \frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\]
Step 5 — Verify using \(\mathbf{u}_i = A\mathbf{v}_i/\sigma_i\).
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{1}{4}\cdot\frac{1}{\sqrt{2}}\begin{bmatrix}3&1\\1&3\end{bmatrix}\begin{bmatrix}1\\1\end{bmatrix} = \frac{1}{4\sqrt{2}}\begin{bmatrix}4\\4\end{bmatrix} = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\1\end{bmatrix} = \mathbf{v}_1 \checkmark\]
\[\mathbf{u}_2 = \frac{A\mathbf{v}_2}{\sigma_2} = \frac{1}{2}\cdot\frac{1}{\sqrt{2}}\begin{bmatrix}3&1\\1&3\end{bmatrix}\begin{bmatrix}1\\-1\end{bmatrix} = \frac{1}{2\sqrt{2}}\begin{bmatrix}2\\-2\end{bmatrix} = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\-1\end{bmatrix} = \mathbf{v}_2 \checkmark\]
Step 6 — Assemble SVD.
\[A = Q\begin{bmatrix}4&0\\0&2\end{bmatrix}Q^T = \frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\begin{bmatrix}4&0\\0&2\end{bmatrix}\frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\]
Verification: \(\tfrac{1}{2}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\begin{bmatrix}4&0\\0&2\end{bmatrix}\begin{bmatrix}1&1\\1&-1\end{bmatrix} = \tfrac{1}{2}\begin{bmatrix}4&2\\4&-2\end{bmatrix}\begin{bmatrix}1&1\\1&-1\end{bmatrix} = \tfrac{1}{2}\begin{bmatrix}6&2\\2&6\end{bmatrix} = \begin{bmatrix}3&1\\1&3\end{bmatrix} = A\). ✓
Answer: \(\sigma_1 = 4\), \(\sigma_2 = 2\). \(U = V = \tfrac{1}{\sqrt{2}}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\), \(\Sigma = \begin{bmatrix}4&0\\0&2\end{bmatrix}\).
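Statement (c) of the previous problem is visible here numerically: for this SPD matrix, the eigenvalues and the singular values coincide.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])
eigvals = np.sort(np.linalg.eigvalsh(A))[::-1]  # descending: 4, 2
s = np.linalg.svd(A, compute_uv=False)          # descending: 4, 2
```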
4.16 Rank and SVD from Known Singular Values (Assignment 8, Problem 8)
A matrix \(A\) has singular values \(\sigma_1 = 5\), \(\sigma_2 = 3\), \(\sigma_3 = 0\). (a) What is the rank of \(A\)? (b) Write the SVD of \(A^T\) in terms of the singular vectors of \(A\). (c) What are the singular values of \(2A\)?
Click to see the solution
Key Concept: Given the SVD \(A = U\Sigma V^T\), we can immediately read off properties of \(A^T\) and \(cA\) using the algebraic SVD formulas, without knowing \(U\), \(V\), or the shape of \(A\) explicitly.
Part (a) — Rank of \(A\).
Step 1: The rank of \(A\) equals the number of strictly positive singular values.
Step 2: We have \(\sigma_1 = 5 > 0\), \(\sigma_2 = 3 > 0\), \(\sigma_3 = 0\). Two singular values are positive.
\[\text{rank}(A) = 2\]
Part (b) — SVD of \(A^T\).
Step 1: If \(A = U\Sigma V^T\), then by the transpose formula: \[A^T = (U\Sigma V^T)^T = V\Sigma^T U^T\]
Step 2: If \(A\) is square (\(3\times 3\)), then \(\Sigma = \text{diag}(5,3,0)\) is square diagonal and \(\Sigma^T = \Sigma\); for a rectangular \(A\), \(\Sigma^T\) is the rectangular diagonal matrix with the same entries and the dimensions swapped.
Step 3: Therefore \(A^T = V\Sigma U^T\). The left singular vectors of \(A^T\) are the columns of \(V\) (the right singular vectors of \(A\)), and the right singular vectors of \(A^T\) are the columns of \(U\) (the left singular vectors of \(A\)).
Step 4: The singular values of \(A^T\) are the same as those of \(A\): \(\sigma_1 = 5\), \(\sigma_2 = 3\), \(\sigma_3 = 0\).
Part (c) — Singular values of \(2A\).
Step 1: Use the scalar multiple formula: \(2A = U(2\Sigma)V^T\).
Step 2: \(U\) and \(V\) are unchanged. The singular values multiply by \(|2| = 2\):
\[\sigma_1(2A) = 2\times 5 = 10, \qquad \sigma_2(2A) = 2\times 3 = 6, \qquad \sigma_3(2A) = 2\times 0 = 0\]
Answer: (a) Rank \(= 2\). (b) \(A^T = V\Sigma U^T\) with the same singular values \(5, 3, 0\); the roles of \(U\) and \(V\) are swapped. (c) The singular values of \(2A\) are \(10, 6, 0\).
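Parts (b) and (c) hold for any matrix; a quick random check of \(\sigma_i(A^T) = \sigma_i(A)\) and \(\sigma_i(2A) = 2\sigma_i(A)\):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))

s = np.linalg.svd(A, compute_uv=False)
s_T = np.linalg.svd(A.T, compute_uv=False)     # transpose: same sigmas
s_2A = np.linalg.svd(2 * A, compute_uv=False)  # scaling: sigmas doubled
```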
4.17 Compute \(A^TA\) and SVD for a Nilpotent Matrix (Assignment 8, Problem 9)
Let \(A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\). Compute \(A^TA\), find the singular values, and determine the full SVD.
Click to see the solution
Key Concept: \(A\) is a nilpotent matrix (\(A^2 = 0\)), so it is not diagonalizable by eigendecomposition. Nevertheless, the SVD exists for every matrix. Because \(A^TA\) is diagonal, the computation is straightforward.
Step 1 — Compute \(A^TA\).
\[A^TA = \begin{bmatrix}0&0\\1&0\end{bmatrix}\begin{bmatrix}0&1\\0&0\end{bmatrix} = \begin{bmatrix}0\cdot0+0\cdot0 & 0\cdot1+0\cdot0 \\ 1\cdot0+0\cdot0 & 1\cdot1+0\cdot0\end{bmatrix} = \begin{bmatrix}0&0\\0&1\end{bmatrix}\]
Step 2 — Eigenvalues and singular values.
\(A^TA = \text{diag}(0,1)\), so \(\lambda_1 = 1\), \(\lambda_2 = 0\). Thus: \[\sigma_1 = 1, \qquad \sigma_2 = 0, \qquad \text{rank}(A) = 1\]
Note: \(A\) has eigenvalues \(0, 0\) (nilpotent), but singular value \(\sigma_1 = 1 \neq 0\). Singular values and eigenvalues are completely different objects.
Step 3 — Right singular vectors (columns of \(V\)).
Eigenvector of \(A^TA\) for \(\lambda_1 = 1\): \(A^TA\,\mathbf{v} = \mathbf{v}\) gives \((0,v_2)^T = (v_1,v_2)^T\), so \(v_1 = 0\). Thus \(\mathbf{v}_1 = (0,1)^T\).
Eigenvector for \(\lambda_2 = 0\): \(A^TA\,\mathbf{v} = 0\) gives \(v_2 = 0\). Thus \(\mathbf{v}_2 = (1,0)^T\).
\[V = \begin{bmatrix}0&1\\1&0\end{bmatrix}\]
Step 4 — Left singular vectors.
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{1}{1}\begin{bmatrix}0&1\\0&0\end{bmatrix}\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}1\\0\end{bmatrix}\]
For \(\mathbf{u}_2\): orthogonal to \(\mathbf{u}_1 = (1,0)^T\) in \(\mathbb{R}^2\): \(\mathbf{u}_2 = (0,1)^T\).
\[U = \begin{bmatrix}1&0\\0&1\end{bmatrix} = I_2\]
Step 5 — Assemble and verify.
\[A = U\Sigma V^T = \begin{bmatrix}1&0\\0&1\end{bmatrix}\begin{bmatrix}1&0\\0&0\end{bmatrix}\begin{bmatrix}0&1\\1&0\end{bmatrix}\]
Verification: \(\begin{bmatrix}1&0\\0&0\end{bmatrix}\begin{bmatrix}0&1\\1&0\end{bmatrix} = \begin{bmatrix}0&1\\0&0\end{bmatrix} = A\). ✓
Answer: \(A^TA = \begin{bmatrix}0&0\\0&1\end{bmatrix}\); \(\sigma_1 = 1\), \(\sigma_2 = 0\), rank \(= 1\). Full SVD: \(U = I_2\), \(\Sigma = \begin{bmatrix}1&0\\0&0\end{bmatrix}\), \(V = \begin{bmatrix}0&1\\1&0\end{bmatrix}\).
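The contrast between eigenvalues and singular values for this nilpotent matrix is easy to see numerically:

```python
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])
eigvals = np.linalg.eigvals(A)          # both 0: A is nilpotent
s = np.linalg.svd(A, compute_uv=False)  # 1, 0: largest sigma is non-zero
```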
4.18 SVD of a Diagonal Matrix (Tutorial 8, Example 1)
Compute the full SVD of \(A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}\).
Click to see the solution
Key Concept: For a diagonal matrix, \(A^TA\) is also diagonal, making the eigenvalue computation trivial. We follow the standard algorithm: find \(V\) from \(A^TA\), extract \(\Sigma\), then compute \(U\).
Step 1 — Compute \(A^TA\).
\[A^TA = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}^T \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix}\]
Step 2 — Find eigenvalues of \(A^TA\).
Since \(A^TA = \text{diag}(1, 4)\) is already diagonal, the eigenvalues are read directly: \(\lambda_1 = 4\), \(\lambda_2 = 1\).
We list them in decreasing order: \(\lambda_1 = 4 \geq \lambda_2 = 1\).
Step 3 — Extract singular values.
\[\sigma_1 = \sqrt{\lambda_1} = \sqrt{4} = 2, \qquad \sigma_2 = \sqrt{\lambda_2} = \sqrt{1} = 1\]
So \(\Sigma = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\).
Step 4 — Find right singular vectors (columns of \(V\)).
The eigenvector corresponding to \(\lambda_1 = 4\) is the standard basis vector \(\mathbf{e}_2 = (0, 1)^T\) (since \(A^TA\,\mathbf{e}_2 = 4\,\mathbf{e}_2\)). The eigenvector for \(\lambda_2 = 1\) is \(\mathbf{e}_1 = (1, 0)^T\).
\[\mathbf{v}_1 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad \mathbf{v}_2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad V = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\]
Note: \(V = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}\).
Step 5 — Compute left singular vectors.
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\]
\[\mathbf{u}_2 = \frac{A\mathbf{v}_2}{\sigma_2} = \frac{1}{1}\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\]
\[U = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\]
Step 6 — Assemble and verify.
\[A = U\Sigma V^T = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\]
Verification: \(U\Sigma V^T = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 0 & 2 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} = A\). ✓
Answer: \(A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}^T\), with singular values \(\sigma_1 = 2\), \(\sigma_2 = 1\).
4.19 Rank and Subspace Dimensions from Singular Values (Tutorial 8, Example 2)
A matrix \(A \in \mathbb{R}^{5 \times 4}\) has singular values \(10, 5, 2\) (and one zero singular value). Determine the rank and the dimensions of all four fundamental subspaces.
Click to see the solution
Key Concept: The rank equals the number of non-zero singular values. The dimensions of all four fundamental subspaces follow from the rank-nullity theorem.
Step 1 — Determine the rank.
There are three non-zero singular values (\(10, 5, 2\)), so: \[r = \text{rank}(A) = 3\]
Step 2 — Find subspace dimensions.
The matrix is \(5 \times 4\), so \(m = 5\) and \(n = 4\).
- \(C(A)\) (column space, subspace of \(\mathbb{R}^5\)): \(\dim C(A) = r = 3\).
- \(N(A^T)\) (left null space, subspace of \(\mathbb{R}^5\)): \(\dim N(A^T) = m - r = 5 - 3 = 2\).
- \(C(A^T)\) (row space, subspace of \(\mathbb{R}^4\)): \(\dim C(A^T) = r = 3\).
- \(N(A)\) (null space, subspace of \(\mathbb{R}^4\)): \(\dim N(A) = n - r = 4 - 3 = 1\).
Interpretation: The SVD provides explicit orthonormal bases for all four subspaces: \(\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\) span \(C(A)\); \(\mathbf{u}_4, \mathbf{u}_5\) span \(N(A^T)\); \(\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\) span \(C(A^T)\); and \(\mathbf{v}_4\) spans \(N(A)\).
Answer: Rank \(= 3\); \(\dim C(A) = 3\), \(\dim N(A^T) = 2\), \(\dim C(A^T) = 3\), \(\dim N(A) = 1\).
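To sanity-check the dimension counts, one can build a \(5\times 4\) matrix with exactly these singular values (random orthonormal \(U\), \(V\) from QR serve as stand-ins) and read the rank back off numerically:

```python
import numpy as np

m, n = 5, 4
rng = np.random.default_rng(3)
U, _ = np.linalg.qr(rng.standard_normal((m, m)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))

# Rectangular Sigma with singular values 10, 5, 2, 0.
Sigma = np.zeros((m, n))
np.fill_diagonal(Sigma, [10.0, 5.0, 2.0, 0.0])
A = U @ Sigma @ V.T

r = int(np.linalg.matrix_rank(A))
dim_NAT = m - r  # left null space N(A^T)
dim_NA = n - r   # null space N(A)
```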
4.20 SVD of a Rank-1 Matrix (Tutorial 8, Example 3)
Compute the SVD of \(A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \\ 3 & 6 \end{bmatrix}\).
Click to see the solution
Key Concept: This matrix has rank 1 because each row is a scalar multiple of \((1, 2)\). The SVD of a rank-1 matrix has exactly one non-zero singular value. We follow the standard algorithm, then extend \(U\) and \(V\) to full orthogonal matrices.
Step 1 — Compute \(A^TA\).
\[A^TA = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 2 & 4 \\ 3 & 6 \end{bmatrix} = \begin{bmatrix} 1+4+9 & 2+8+18 \\ 2+8+18 & 4+16+36 \end{bmatrix} = \begin{bmatrix} 14 & 28 \\ 28 & 56 \end{bmatrix}\]
Step 2 — Find eigenvalues of \(A^TA\).
\[\det(A^TA - \lambda I) = (14 - \lambda)(56 - \lambda) - 28^2 = \lambda^2 - 70\lambda + (784 - 784) = \lambda^2 - 70\lambda = \lambda(\lambda - 70)\]
So \(\lambda_1 = 70\), \(\lambda_2 = 0\).
Step 3 — Extract singular values.
\[\sigma_1 = \sqrt{70}, \qquad \sigma_2 = 0\]
Only one non-zero singular value, confirming rank 1.
Step 4 — Find right singular vectors.
For \(\lambda_1 = 70\): solve \((A^TA - 70I)\mathbf{v} = 0\):
\[\begin{bmatrix} -56 & 28 \\ 28 & -14 \end{bmatrix}\mathbf{v} = \mathbf{0} \implies -2v_1 + v_2 = 0 \implies v_2 = 2v_1 \implies \mathbf{v}_1 = \frac{1}{\sqrt{5}}\begin{bmatrix} 1 \\ 2 \end{bmatrix} \text{ (normalized)}\]
For \(\lambda_2 = 0\): the null space of \(A^TA\) is orthogonal to \(\mathbf{v}_1\):
\[\mathbf{v}_2 = \frac{1}{\sqrt{5}}\begin{bmatrix} -2 \\ 1 \end{bmatrix}\]
(Check: \(\mathbf{v}_1 \cdot \mathbf{v}_2 = -2/5 + 2/5 = 0\). ✓)
\[V = \frac{1}{\sqrt{5}}\begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix}\]
Step 5 — Compute the first left singular vector.
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{1}{\sqrt{70}}\cdot\frac{1}{\sqrt{5}}\begin{bmatrix} 1 & 2 \\ 2 & 4 \\ 3 & 6 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \frac{1}{\sqrt{350}}\begin{bmatrix} 5 \\ 10 \\ 15 \end{bmatrix} = \frac{5}{\sqrt{350}}\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \frac{1}{\sqrt{14}}\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\]
(using \(\sqrt{350} = 5\sqrt{14}\)).
Step 6 — Extend \(U\) to a full orthonormal basis of \(\mathbb{R}^3\).
We need \(\mathbf{u}_2, \mathbf{u}_3\) orthogonal to \(\mathbf{u}_1 = \tfrac{1}{\sqrt{14}}(1, 2, 3)^T\) and to each other. One valid choice (Gram–Schmidt or inspection):
\[\mathbf{u}_2 = \frac{1}{\sqrt{5}}\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \qquad \mathbf{u}_3 = \frac{1}{\sqrt{70}}\begin{bmatrix} -3 \\ -6 \\ 5 \end{bmatrix}\]
(These form an orthonormal set together with \(\mathbf{u}_1\).)
Step 7 — Assemble the SVD.
\[A = U\Sigma V^T = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3 \end{bmatrix} \begin{bmatrix} \sqrt{70} & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \mathbf{v}_1^T \\ \mathbf{v}_2^T \end{bmatrix}\]
Answer: \(\sigma_1 = \sqrt{70}\), \(\sigma_2 = 0\); rank \(= 1\). The reduced SVD is \(A = \sqrt{70}\,\mathbf{u}_1\mathbf{v}_1^T\).
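As a sanity check (an illustration, not part of the tutorial), `np.linalg.svd` on this matrix should report \(\sigma_1 = \sqrt{70} \approx 8.3666\) and a numerically zero second singular value, and the single rank-1 term should reconstruct \(A\):

```python
import numpy as np

# Verify the worked example; np.linalg.svd orders singular values descending.
A = np.array([[1., 2.], [2., 4.], [3., 6.]])
U, s, Vt = np.linalg.svd(A)
print(s[0], 70 ** 0.5)         # both ~8.3666; s[1] is numerically zero

# Reduced (rank-1) SVD reconstructs A:
A1 = s[0] * np.outer(U[:, 0], Vt[0, :])
print(np.allclose(A1, A))      # True
```

Note that numpy may return singular vectors with signs flipped relative to the hand computation; the product \(\sigma_1\mathbf{u}_1\mathbf{v}_1^T\) is unaffected.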
4.21 Four Fundamental Subspaces from SVD (Tutorial 8, Example 4)
A \(4 \times 5\) matrix \(A\) has SVD with \(\Sigma = \text{diag}(8, 5, 2, 0)\) (as a \(4 \times 5\) matrix) and orthonormal left singular vectors \(\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4\) and right singular vectors \(\mathbf{v}_1, \ldots, \mathbf{v}_5\). Identify all four fundamental subspaces.
Key Concept: Given the SVD, the four fundamental subspaces are read directly from the singular vectors and the rank. No further computation is needed.
Step 1 — Determine the rank.
The diagonal of \(\Sigma\) is \((8, 5, 2, 0)\). There are three non-zero singular values, so: \[r = \text{rank}(A) = 3\]
Step 2 — Read off the four fundamental subspaces.
The matrix is \(4 \times 5\), so \(m = 4\), \(n = 5\).
- Column space \(C(A) \subseteq \mathbb{R}^4\), \(\dim = 3\): Spanned by \(\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}\) — the left singular vectors corresponding to the non-zero singular values.
- Left null space \(N(A^T) \subseteq \mathbb{R}^4\), \(\dim = m - r = 1\): Spanned by \(\{\mathbf{u}_4\}\) — the left singular vector corresponding to \(\sigma_4 = 0\).
- Row space \(C(A^T) \subseteq \mathbb{R}^5\), \(\dim = 3\): Spanned by \(\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}\) — the right singular vectors corresponding to the non-zero singular values.
- Null space \(N(A) \subseteq \mathbb{R}^5\), \(\dim = n - r = 2\): Spanned by \(\{\mathbf{v}_4, \mathbf{v}_5\}\) — the right singular vectors corresponding to the zero singular values.
Verification of dimensions: \(3 + 1 = 4 = m\) ✓ and \(3 + 2 = 5 = n\) ✓.
Orthogonality: \(C(A) \perp N(A^T)\) in \(\mathbb{R}^4\) (since \(U\) is orthogonal) and \(C(A^T) \perp N(A)\) in \(\mathbb{R}^5\) (since \(V\) is orthogonal). This is the Fundamental Theorem of Linear Algebra.
Answer: \(C(A) = \text{span}\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}\); \(N(A^T) = \text{span}\{\mathbf{u}_4\}\); \(C(A^T) = \text{span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}\); \(N(A) = \text{span}\{\mathbf{v}_4, \mathbf{v}_5\}\).
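The same read-off works in code. Since the tutorial's specific singular vectors are not given numerically, the sketch below uses a hypothetical \(4 \times 5\) matrix of rank 3 (the construction and seed are assumptions) and slices bases for the four subspaces directly out of `np.linalg.svd`:

```python
import numpy as np

# Hypothetical 4x5 matrix of rank 3 by construction (product of 4x3 and 3x5).
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3)) @ rng.standard_normal((3, 5))
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))

col_space  = U[:, :r]      # C(A):   first r columns of U
left_null  = U[:, r:]      # N(A^T): remaining columns of U
row_space  = Vt[:r, :].T   # C(A^T): first r rows of V^T
null_space = Vt[r:, :].T   # N(A):   remaining rows of V^T

# A kills N(A), and N(A^T) is orthogonal to every column of A:
print(np.allclose(A @ null_space, 0), np.allclose(left_null.T @ A, 0))  # True True
```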
4.22 SVD as a Sum of Rank-1 Matrices (Tutorial 8, Example 5)
Compute the full SVD of \(A = \begin{bmatrix} 2 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}\) and express \(A\) as a sum of rank-1 matrices.
Key Concept: Once we have the SVD, the rank-1 expansion \(A = \sum_i \sigma_i \mathbf{u}_i \mathbf{v}_i^T\) is immediate. We verify the expansion by showing the sum reconstructs \(A\).
Step 1 — Compute \(A^TA\).
\[A^TA = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}\]
Step 2 — Eigenvalues and singular values.
\(A^TA = \text{diag}(4, 1)\) is diagonal, so \(\lambda_1 = 4\), \(\lambda_2 = 1\).
\[\sigma_1 = 2, \qquad \sigma_2 = 1\]
Both are positive, so rank \(= 2\).
Step 3 — Right singular vectors.
\[\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad (\text{eigenvector of } A^TA \text{ for } \lambda = 4), \qquad \mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \quad (\text{for } \lambda = 1)\]
\[V = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2\]
Step 4 — Left singular vectors.
\[\mathbf{u}_1 = \frac{A\mathbf{v}_1}{\sigma_1} = \frac{1}{2}\begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\]
\[\mathbf{u}_2 = \frac{A\mathbf{v}_2}{\sigma_2} = \frac{1}{1}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\]
Extend \(U\) to \(\mathbb{R}^3\): We need \(\mathbf{u}_3\) orthogonal to both \(\mathbf{u}_1 = (1,0,0)^T\) and \(\mathbf{u}_2 = (0,0,1)^T\). Up to sign, the only unit vector in \(\mathbb{R}^3\) orthogonal to both is:
\[\mathbf{u}_3 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\]
\[U = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}\]
Step 5 — Write the SVD.
\[A = U\Sigma V^T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\]
Step 6 — Rank-1 expansion.
\[A = \sigma_1\,\mathbf{u}_1\mathbf{v}_1^T + \sigma_2\,\mathbf{u}_2\mathbf{v}_2^T = 2\begin{bmatrix}1\\0\\0\end{bmatrix}\begin{bmatrix}1&0\end{bmatrix} + 1\begin{bmatrix}0\\0\\1\end{bmatrix}\begin{bmatrix}0&1\end{bmatrix}\]
\[= \begin{bmatrix}2&0\\0&0\\0&0\end{bmatrix} + \begin{bmatrix}0&0\\0&0\\0&1\end{bmatrix} = \begin{bmatrix}2&0\\0&0\\0&1\end{bmatrix} = A \quad \checkmark\]
Answer: \(\sigma_1 = 2\), \(\sigma_2 = 1\); \(A = 2\,\mathbf{u}_1\mathbf{v}_1^T + 1\,\mathbf{u}_2\mathbf{v}_2^T\).
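The rank-1 expansion is easy to verify numerically (an illustration, not part of the tutorial): summing \(\sigma_i\,\mathbf{u}_i\mathbf{v}_i^T\) over all singular values must give back \(A\).

```python
import numpy as np

# Rank-1 expansion A = sum_i sigma_i * u_i * v_i^T for the example above.
A = np.array([[2., 0.], [0., 0.], [0., 1.]])
U, s, Vt = np.linalg.svd(A)
expansion = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
print(s)                          # [2. 1.]
print(np.allclose(expansion, A))  # True
```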
4.23 Prove the Eigenvalue Connection \(A^TA = V\Sigma^T\Sigma V^T\) (Tutorial 8, Example 6)
Let \(A = U\Sigma V^T\) be the SVD of \(A\). Prove that \(A^TA = V\Sigma^T\Sigma V^T\) and deduce that the eigenvalues of \(A^TA\) are \(\sigma_i^2\).
Key Concept: This proof directly establishes the connection between SVD and eigendecomposition, showing why \(A^TA\) is the right matrix to compute when finding singular values. The key tools are the transpose of a product and the orthogonality \(U^TU = I\).
Step 1 — Expand \(A^TA\) using the SVD.
\[A^TA = (U\Sigma V^T)^T(U\Sigma V^T)\]
Using the property \((XYZ)^T = Z^TY^TX^T\):
\[(U\Sigma V^T)^T = (V^T)^T\Sigma^T U^T = V\Sigma^T U^T\]
Therefore:
\[A^TA = (V\Sigma^T U^T)(U\Sigma V^T)\]
Step 2 — Simplify using \(U^TU = I\).
Since \(U\) is orthogonal, \(U^TU = I\). Thus:
\[A^TA = V\Sigma^T (U^T U)\Sigma V^T = V\Sigma^T I\,\Sigma V^T = V(\Sigma^T\Sigma)V^T\]
Step 3 — Interpret as eigendecomposition.
The matrix \(\Sigma^T\Sigma\) is diagonal: if \(\Sigma \in \mathbb{R}^{m \times n}\) has diagonal entries \(\sigma_1, \ldots, \sigma_r\) (and zeros elsewhere), then \(\Sigma^T\Sigma \in \mathbb{R}^{n \times n}\) is the diagonal matrix \(\text{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_r^2, 0, \ldots, 0)\).
The expression \(V(\Sigma^T\Sigma)V^T\) is exactly the eigendecomposition of \(A^TA\): it is in the form \(Q\Lambda Q^T\) with \(Q = V\) orthogonal and \(\Lambda = \Sigma^T\Sigma\) diagonal.
Conclusion: The eigenvalues of \(A^TA\) are the diagonal entries of \(\Sigma^T\Sigma\), which are \(\sigma_1^2, \sigma_2^2, \ldots, \sigma_r^2, 0, \ldots, 0\). In particular, all eigenvalues of \(A^TA\) are non-negative — which confirms that \(A^TA\) is positive semi-definite for any matrix \(A\).
Answer: \(A^TA = V(\Sigma^T\Sigma)V^T\); eigenvalues of \(A^TA\) are \(\sigma_i^2 \geq 0\).
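The identity can be spot-checked numerically on an arbitrary rectangular matrix (the matrix and seed below are illustrative assumptions): the eigenvalues of \(A^TA\) should equal the squared singular values of \(A\), padded with zeros if \(A\) is rank-deficient.

```python
import numpy as np

# Check: eigenvalues of A^T A equal the squared singular values of A.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))                    # hypothetical 5x3 matrix
s = np.linalg.svd(A, compute_uv=False)             # descending singular values
eig = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]   # eigenvalues, sorted descending
print(np.allclose(eig, s ** 2))                    # True
```

`eigvalsh` is appropriate here because \(A^TA\) is symmetric, which also guarantees real eigenvalues.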
4.24 Image Compression via SVD (Tutorial 8, Example 7)
A grayscale image is represented as a matrix \(M \in \mathbb{R}^{500 \times 400}\). Suppose the rank of \(M\) is \(400\) (all singular values are non-zero). (a) How much storage does the full matrix require? (b) Compute the storage cost for the rank-3 SVD approximation. (c) What is the compression ratio?
Key Concept: The rank-\(k\) SVD approximation stores \(k\) left singular vectors (each of length \(m\)), \(k\) right singular vectors (each of length \(n\)), and \(k\) singular values — a total of \(k(m + n + 1)\) numbers, which can be dramatically less than the \(mn\) entries of the full matrix.
Step 1 — Full matrix storage.
\[\text{Storage}_{\text{full}} = m \times n = 500 \times 400 = 200{,}000 \text{ numbers}\]
Step 2 — Rank-3 approximation storage.
For \(k = 3\), \(m = 500\), \(n = 400\):
\[\text{Storage}_{k=3} = k(m + n + 1) = 3(500 + 400 + 1) = 3 \times 901 = 2{,}703 \text{ numbers}\]
Breaking this down:
- Singular values: \(k = 3\) numbers
- Left singular vectors: \(km = 3 \times 500 = 1{,}500\) numbers
- Right singular vectors: \(kn = 3 \times 400 = 1{,}200\) numbers
- Total: \(3 + 1{,}500 + 1{,}200 = 2{,}703\) numbers ✓
Step 3 — Compression ratio.
\[\text{Compression ratio} = \frac{\text{Storage}_{k=3}}{\text{Storage}_{\text{full}}} = \frac{2{,}703}{200{,}000} \approx 0.0135 = 1.35\%\]
The rank-3 approximation uses only about \(1.35\%\) of the storage of the full matrix. Whether this approximation is visually acceptable depends on how much energy the top 3 singular values capture — if the image has strong low-rank structure (e.g., smooth backgrounds), the approximation may look quite good.
Answer: Full storage: \(200{,}000\); rank-3 storage: \(2{,}703\); compression ratio \(\approx 1.35\%\).
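The storage arithmetic above can be packaged as a tiny helper (a sketch; the function name is an illustrative choice):

```python
# Rank-k SVD storage cost k(m + n + 1) versus full storage m*n.
def svd_storage(m, n, k):
    # k singular values + k left vectors of length m + k right vectors of length n
    return k * (m + n + 1)

m, n, k = 500, 400, 3
full = m * n
compressed = svd_storage(m, n, k)
print(full, compressed, compressed / full)   # 200000 2703 0.013515
```

The same helper makes it easy to find, say, the largest \(k\) that stays under a given storage budget.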